
Showing papers on "Task analysis published in 2018"


Proceedings ArticleDOI
19 Feb 2018
TL;DR: In this article, the authors make the observation that the performance of multi-task learning is strongly dependent on the relative weighting between each task's loss, and propose a principled approach to weight multiple loss functions by considering the homoscedastic uncertainty of each task.
Abstract: Numerous deep learning applications benefit from multitask learning with multiple regression and classification objectives. In this paper we make the observation that the performance of such systems is strongly dependent on the relative weighting between each task's loss. Tuning these weights by hand is a difficult and expensive process, making multi-task learning prohibitive in practice. We propose a principled approach to multi-task deep learning which weighs multiple loss functions by considering the homoscedastic uncertainty of each task. This allows us to simultaneously learn various quantities with different units or scales in both classification and regression settings. We demonstrate our model learning per-pixel depth regression, semantic and instance segmentation from a monocular input image. Perhaps surprisingly, we show our model can learn multi-task weightings and outperform separate models trained individually on each task.

1,515 citations
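The homoscedastic-uncertainty weighting described in the abstract above can be sketched as follows. This is an illustrative reconstruction, not the authors' code: each task i gets a learnable log-variance s_i, and the combined loss sum_i exp(-s_i) * L_i + s_i automatically down-weights noisier tasks (the function name is ours).

```python
import numpy as np

# Hedged sketch of uncertainty-based multi-task loss weighting:
# s_i = log(sigma_i^2) is a learnable parameter per task, and the
# combined loss is sum_i exp(-s_i) * L_i + s_i, so a task with high
# uncertainty (large sigma_i) receives a smaller effective weight.
def combined_loss(task_losses, log_vars):
    task_losses = np.asarray(task_losses, dtype=float)
    log_vars = np.asarray(log_vars, dtype=float)
    return float(np.sum(np.exp(-log_vars) * task_losses + log_vars))

# With equal log-variances, the tasks are weighted equally:
print(combined_loss([1.0, 4.0], [0.0, 0.0]))  # 5.0
# Raising a task's log-variance shrinks its effective weight:
print(combined_loss([1.0, 4.0], [0.0, np.log(4.0)]))  # 1 + 1 + log 4 ~= 3.386
```

In practice the log-variances are trained jointly with the network weights by gradient descent; the additive s_i term keeps the model from trivially inflating all uncertainties.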


Proceedings ArticleDOI
18 Jun 2018
TL;DR: In this article, the authors propose a fully computational approach to modeling the structure of the space of visual tasks, producing a taxonomic map for task transfer learning, together with a set of tools for computing and probing this structure, including a solver that finds supervision policies for a given use case.
Abstract: Do visual tasks have a relationship, or are they unrelated? For instance, could having surface normals simplify estimating the depth of an image? Intuition answers these questions positively, implying existence of a structure among visual tasks. Knowing this structure has notable values; it is the concept underlying transfer learning and provides a principled way for identifying redundancies across tasks, e.g., to seamlessly reuse supervision among related tasks or solve many tasks in one system without piling up the complexity. We propose a fully computational approach for modeling the structure of the space of visual tasks. This is done via finding (first and higher-order) transfer learning dependencies across a dictionary of twenty-six 2D, 2.5D, 3D, and semantic tasks in a latent space. The product is a computational taxonomic map for task transfer learning. We study the consequences of this structure, e.g. nontrivial emerged relationships, and exploit them to reduce the demand for labeled data. We provide a set of tools for computing and probing this taxonomical structure including a solver users can employ to find supervision policies for their use cases.

971 citations


Journal ArticleDOI
TL;DR: This paper investigates the task offloading problem in ultra-dense networks, aiming to minimize delay while saving the battery life of the user's equipment, and proposes an efficient offloading scheme that reduces task duration by 20% with 30% energy savings.
Abstract: With the development of recent innovative applications (e.g., augmented reality, self-driving, and various cognitive applications), more and more computation-intensive and data-intensive tasks are delay-sensitive. Mobile edge computing in ultra-dense networks is expected to be an effective solution for meeting the low latency demand. However, the distributed computing resources in the edge cloud and the energy dynamics of the mobile device's battery make it challenging to offload tasks for users. In this paper, leveraging the idea of software defined networking, we investigate the task offloading problem in ultra-dense networks, aiming to minimize the delay while saving the battery life of the user's equipment. Specifically, we formulate the task offloading problem as a mixed integer non-linear program, which is NP-hard. In order to solve it, we decompose this optimization problem into two sub-problems, i.e., a task placement sub-problem and a resource allocation sub-problem. Based on the solutions of the two sub-problems, we propose an efficient offloading scheme. Simulation results show that the proposed scheme can reduce task duration by 20% with 30% energy savings, compared with random and uniform task offloading schemes.

821 citations


Proceedings ArticleDOI
18 Jun 2018
TL;DR: A new AI task where an agent is spawned at a random location in a 3D environment and asked a question ('What color is the car?'), and the agent must first intelligently navigate to explore the environment, gather necessary visual information through first-person (egocentric) vision, and then answer the question.
Abstract: We present a new AI task - Embodied Question Answering (EmbodiedQA) - where an agent is spawned at a random location in a 3D environment and asked a question ('What color is the car?'). In order to answer, the agent must first intelligently navigate to explore the environment, gather necessary visual information through first-person (egocentric) vision, and then answer the question ('orange'). EmbodiedQA requires a range of AI skills - language understanding, visual recognition, active perception, goal-driven navigation, commonsense reasoning, long-term memory, and grounding language into actions. In this work, we develop a dataset of questions and answers in House3D environments [1], evaluation metrics, and a hierarchical model trained with imitation and reinforcement learning.

368 citations


Journal ArticleDOI
TL;DR: This survey provides a comprehensive discussion of all aspects of MASs, from definitions, features, applications, challenges, and communications to evaluation, along with a classification of MAS applications and challenges.
Abstract: Multi-agent systems (MASs) have received tremendous attention from scholars in different disciplines, including computer science and civil engineering, as a means to solve complex problems by subdividing them into smaller tasks. The individual tasks are allocated to autonomous entities, known as agents. Each agent decides on a proper action to solve the task using multiple inputs, e.g., history of actions, interactions with its neighboring agents, and its goal. The MAS has found multiple applications, including modeling complex systems, smart grids, and computer networks. Despite their wide applicability, there are still a number of challenges faced by MAS, including coordination between agents, security, and task allocation. This survey provides a comprehensive discussion of all aspects of MAS, starting from definitions, features, applications, challenges, and communications to evaluation. A classification on MAS applications and challenges is provided along with references for further studies. We expect this paper to serve as an insightful and comprehensive resource on the MAS for researchers and practitioners in the area.

290 citations


Proceedings ArticleDOI
18 Jun 2018
TL;DR: In this paper, the authors propose a novel framework for self-supervised learning that overcomes limitations in designing and comparing different tasks, models, and data domains by decoupling the structure of the self-supervised model from the final task-specific fine-tuned model.
Abstract: In self-supervised learning, one trains a model to solve a so-called pretext task on a dataset without the need for human annotation. The main objective, however, is to transfer this model to a target domain and task. Currently, the most effective transfer strategy is fine-tuning, which restricts one to use the same model or parts thereof for both pretext and target tasks. In this paper, we present a novel framework for self-supervised learning that overcomes limitations in designing and comparing different tasks, models, and data domains. In particular, our framework decouples the structure of the self-supervised model from the final task-specific fine-tuned model. This allows us to: 1) quantitatively assess previously incompatible models including handcrafted features; 2) show that deeper neural network models can learn better representations from the same pretext task; 3) transfer knowledge learned with a deep model to a shallower one and thus boost its learning. We use this framework to design a novel self-supervised task, which achieves state-of-the-art performance on the common benchmarks in PASCAL VOC 2007, ILSVRC12 and Places by a significant margin. Our learned features shrink the mAP gap between models trained via self-supervised learning and supervised learning from 5.9% to 2.6% in object detection on PASCAL VOC 2007.

244 citations


Proceedings ArticleDOI
23 Apr 2018
TL;DR: Imitating expert demonstration is a powerful mechanism for learning to perform tasks from raw sensory observations as discussed by the authors, where the expert typically provides multiple demonstrations of a task at training time, and this generates data in the form of observation-action pairs from the agent's point of view.
Abstract: Imitating expert demonstration is a powerful mechanism for learning to perform tasks from raw sensory observations. The current dominant paradigm in learning from demonstration (LfD) [3,16,19,20] requires the expert to either manually move the robot joints (i.e., kinesthetic teaching) or teleoperate the robot to execute the desired task. The expert typically provides multiple demonstrations of a task at training time, and this generates data in the form of observation-action pairs from the agent's point of view. The agent then distills this data into a policy for performing the task of interest. Such a heavily supervised approach, where it is necessary to provide demonstrations by controlling the robot, is incredibly tedious for the human expert. Moreover, for every new task that the robot needs to execute, the expert is required to provide a new set of demonstrations.

238 citations


Journal ArticleDOI
TL;DR: This paper introduces a new concept of task caching and proposes an efficient algorithm, called task caching and offloading (TCO), based on an alternating iterative method, which outperforms alternative schemes in terms of energy cost.
Abstract: While augmented reality applications are becoming popular, more and more data-hungry and computation-intensive tasks are delay-sensitive. Mobile edge computing is expected to be an effective solution for meeting the low latency demand. In contrast to previous work on mobile edge computing, which mainly focuses on computation offloading, this paper introduces a new concept of task caching. Task caching refers to caching completed task applications and their related data in the edge cloud. We then investigate the problem of jointly optimizing task caching and offloading on the edge cloud under computing and storage resource constraints. We formulate this problem as a mixed integer program, which is hard to solve. To solve it, we propose an efficient algorithm, called task caching and offloading (TCO), based on an alternating iterative method. Finally, simulation results show that the proposed TCO algorithm outperforms alternative schemes in terms of energy cost.

219 citations
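The task-caching idea in the entry above can be illustrated with a toy heuristic. This is NOT the paper's TCO algorithm, just a hedged sketch of the underlying tradeoff: under a storage budget, cache the tasks that save the most offloading energy per unit of storage (all names and numbers here are invented for illustration).

```python
# Toy greedy caching heuristic (illustrative only, not the TCO
# algorithm from the paper): rank tasks by energy saved per unit of
# edge-cloud storage, then cache greedily until the budget is spent.
def greedy_cache(tasks, capacity):
    """tasks: list of (name, storage_size, energy_saved); capacity: storage budget."""
    ranked = sorted(tasks, key=lambda t: t[2] / t[1], reverse=True)
    cached, used = [], 0
    for name, size, saved in ranked:
        if used + size <= capacity:
            cached.append(name)
            used += size
    return cached

tasks = [("AR-render", 4, 8.0), ("face-detect", 2, 6.0), ("transcode", 5, 5.0)]
print(greedy_cache(tasks, capacity=6))  # ['face-detect', 'AR-render']
```

The paper's actual formulation is a joint mixed integer program over caching and offloading decisions; a greedy density rule like this is only a baseline intuition for why caching popular, expensive tasks cuts energy cost.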


Proceedings ArticleDOI
18 Jun 2018
TL;DR: A novel hierarchical reinforcement learning framework for video captioning, where a high-level Manager module learns to design sub-goals and a low-level Worker module recognizes the primitive actions to fulfill the sub-goal.
Abstract: Video captioning is the task of automatically generating a textual description of the actions in a video. Although previous work (e.g. sequence-to-sequence model) has shown promising results in abstracting a coarse description of a short video, it is still very challenging to caption a video containing multiple fine-grained actions with a detailed description. This paper aims to address the challenge by proposing a novel hierarchical reinforcement learning framework for video captioning, where a high-level Manager module learns to design sub-goals and a low-level Worker module recognizes the primitive actions to fulfill the sub-goal. With this compositional framework to reinforce video captioning at different levels, our approach significantly outperforms all the baseline methods on a newly introduced large-scale dataset for fine-grained video captioning. Furthermore, our non-ensemble model has already achieved the state-of-the-art results on the widely-used MSR-VTT dataset.

218 citations


Proceedings ArticleDOI
18 Jun 2018
TL;DR: In this article, the authors introduce CNN architectures for both binary and multi-way cross-modal face and audio matching, and compare dynamic and static testing with human testing as a baseline to calibrate the difficulty of the task.
Abstract: We introduce a seemingly impossible task: given only an audio clip of someone speaking, decide which of two face images is the speaker. In this paper we study this, and a number of related cross-modal tasks, aimed at answering the question: how much can we infer from the voice about the face and vice versa? We study this task "in the wild", employing the datasets that are now publicly available for face recognition from static images (VGGFace) and speaker identification from audio (VoxCeleb). These provide training and testing scenarios for both static and dynamic testing of cross-modal matching. We make the following contributions: (i) we introduce CNN architectures for both binary and multi-way cross-modal face and audio matching; (ii) we compare dynamic testing (where video information is available, but the audio is not from the same video) with static testing (where only a single still image is available); and (iii) we use human testing as a baseline to calibrate the difficulty of the task. We show that a CNN can indeed be trained to solve this task in both the static and dynamic scenarios, and is even well above chance on 10-way classification of the face given the voice. The CNN matches human performance on easy examples (e.g. different gender across faces) but exceeds human performance on more challenging examples (e.g. faces with the same gender, age and nationality).

199 citations


Journal ArticleDOI
Jianhui Liu, Qi Zhang
TL;DR: Three algorithms, based on heuristic search, the reformulation linearization technique, and semi-definite relaxation, solve the problem by optimizing EN candidate selection, offloading ordering, and task allocation, striking a good balance between latency and reliability in uRLLC.
Abstract: Ultra-reliable low latency communications (uRLLC) in the fifth generation mobile communication system aims to support diverse emerging applications with strict latency and reliability requirements. Mobile edge computing (MEC) is considered a promising solution for reducing the latency of computation-intensive tasks by leveraging powerful computing units at short range. State-of-the-art work on task offloading to MEC mainly focuses on the tradeoff between latency and energy consumption, rather than reliability. In this paper, the tradeoff between latency and reliability in task offloading to MEC is studied. A framework is provided in which user equipment partitions a task into sub-tasks and offloads them to multiple nearby edge nodes (ENs) in sequence. In this framework, we formulate an optimization problem to jointly minimize the latency and offloading failure probability. Since the formulated problem is nonconvex, we design three algorithms based on heuristic search, the reformulation linearization technique, and semi-definite relaxation, respectively, and solve the problem by optimizing EN candidate selection, offloading ordering, and task allocation. Compared with previous work, numerical simulation results show that the proposed algorithms strike a good balance between latency and reliability in uRLLC. Among them, the Heuristic Algorithm achieves the best performance in terms of latency and reliability with minimal complexity.

Proceedings ArticleDOI
21 May 2018
TL;DR: Neural Task Programming (NTP) as mentioned in this paper takes as input a task specification (e.g., video demonstration of a task) and recursively decomposes it into finer sub-task specifications, where bottom-level programs are callable subroutines that interact with the environment.
Abstract: In this work, we propose a novel robot learning framework called Neural Task Programming (NTP), which bridges the idea of few-shot learning from demonstration and neural program induction. NTP takes as input a task specification (e.g., video demonstration of a task) and recursively decomposes it into finer sub-task specifications. These specifications are fed to a hierarchical neural program, where bottom-level programs are callable subroutines that interact with the environment. We validate our method in three robot manipulation tasks. NTP achieves strong generalization across sequential tasks that exhibit hierarchical and compositional structures. The experimental results show that NTP learns to generalize well towards unseen tasks with increasing lengths, variable topologies, and changing objectives. Project page: stanfordvl.github.io/ntp/.

Journal ArticleDOI
TL;DR: A comprehensive analytical model that considers circuit, computation, and offloading energy consumption is developed for accurately evaluating the overall energy efficiency (EE) in homogeneous fog networks, and a maximal energy-efficient task scheduling (MEETS) algorithm is proposed.
Abstract: A homogeneous fog network is defined as a group of peer nodes with sharable computing and storage resources, as well as spare spectrum for node-to-node/device-to-device communications and task scheduling. It promotes more intelligent applications and services in different Internet of Things (IoT) scenarios, thanks to effective collaborations among neighboring fog nodes via cognitive spectrum access techniques. In this paper, a comprehensive analytical model that considers circuit, computation, offloading energy consumptions is developed for accurately evaluating the overall energy efficiency (EE) in homogeneous fog networks. With this model, the tradeoff relationship between performance gains and energy costs in collaborative task offloading is investigated, thus enabling us to formulate the EE optimization problem for future intelligent IoT applications with practical constraints in available computing resources at helper nodes and unused spectrum in neighboring environments. Based on rigorous mathematical analysis, a maximal energy-efficient task scheduling (MEETS) algorithm is proposed to derive the optimal scheduling decision for a task node and multiple neighboring helper nodes under feasible modulation schemes and time allocations. Extensive simulation results demonstrate the tradeoff relationship between EE and task scheduling performance in homogeneous fog networks. Compared with traditional task scheduling strategies, the proposed MEETS algorithm can achieve much better EE performance under different network parameters and service conditions.

Proceedings ArticleDOI
05 Nov 2018
TL;DR: The proposed method, called RouteNet, can either evaluate the overall routability of cell placement solutions without global routing or predict the locations of DRC (Design Rule Checking) hotspots, and significantly outperforms other machine learning approaches such as support vector machine and logistic regression.
Abstract: Early routability prediction helps designers and tools perform preventive measures so that design rule violations can be avoided in a proactive manner. However, it is a huge challenge to have a predictor that is both accurate and fast. In this work, we study how to leverage convolutional neural networks to address this challenge. The proposed method, called RouteNet, can either evaluate the overall routability of cell placement solutions without global routing or predict the locations of DRC (Design Rule Checking) hotspots. In both cases, large macros in mixed-size designs are taken into consideration. Experiments on benchmark circuits show that RouteNet can forecast overall routability with accuracy similar to that of a global router while using substantially less runtime. For DRC hotspot prediction, RouteNet improves accuracy by 50% compared to global routing. It also significantly outperforms other machine learning approaches such as support vector machines and logistic regression.

Proceedings ArticleDOI
Qi Wu, Peng Wang, Chunhua Shen, Ian Reid, Anton van den Hengel
18 Jun 2018
TL;DR: A novel approach that combines Reinforcement Learning and Generative Adversarial Networks (GANs) to generate more human-like responses to questions, overcoming the relative paucity of training data and the tendency of the typical MLE-based approach to generate overly terse answers.
Abstract: The visual dialog task requires an agent to engage in a conversation about an image with a human. It represents an extension of the visual question answering task in that the agent needs to answer a question about an image, but it needs to do so in light of the previous dialog that has taken place. The key challenge in visual dialog is thus maintaining a consistent, and natural dialog while continuing to answer questions correctly. We present a novel approach that combines Reinforcement Learning and Generative Adversarial Networks (GANs) to generate more human-like responses to questions. The GAN helps overcome the relative paucity of training data, and the tendency of the typical MLE-based approach to generate overly terse answers. Critically, the GAN is tightly integrated into the attention mechanism that generates human-interpretable reasons for each answer. This means that the discriminative model of the GAN has the task of assessing whether a candidate answer is generated by a human or not, given the provided reason. This is significant because it drives the generative model to produce high quality answers that are well supported by the associated reasoning. The method also generates the state-of-the-art results on the primary benchmark.

Journal ArticleDOI
TL;DR: It is shown that exploiting unlabeled data consistently leads to better emotion recognition performance across all emotional dimensions, and the effect of adversarial training on the feature representation across the proposed deep learning architecture is visualized.
Abstract: The performance of speech emotion recognition is affected by the differences in data distributions between train (source domain) and test (target domain) sets used to build and evaluate the models. This is a common problem, as multiple studies have shown that the performance of emotional classifiers drops when they are exposed to data that do not match the distribution used to build the emotion classifiers. The difference in data distributions becomes very clear when the training and testing data come from different domains, causing a large performance gap between development and testing performance. Due to the high cost of annotating new data and the abundance of unlabeled data, it is crucial to extract as much useful information as possible from the available unlabeled data. This study looks into the use of adversarial multitask training to extract a common representation between train and test domains. The primary task is to predict emotional-attribute-based descriptors for arousal, valence, or dominance. The secondary task is to learn a common representation, where the train and test domains cannot be distinguished. By using a gradient reversal layer, the gradients coming from the domain classifier are used to bring the source and target domain representations closer. We show that exploiting unlabeled data consistently leads to better emotion recognition performance across all emotional dimensions. We visualize the effect of adversarial training on the feature representation across the proposed deep learning architecture. The analysis shows that the data representations for the train and test domains converge as the data are passed to deeper layers of the network. We also evaluate the difference in performance when we use a shallow neural network versus a deep neural network and the effect of the number of shared layers used by the task and domain classifiers.
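The gradient reversal layer mentioned in this abstract can be sketched in a few lines. This is an assumed toy implementation, not the authors' code: the layer is the identity on the forward pass and multiplies the incoming gradient by -lambda on the backward pass, so the shared encoder is pushed to produce domain-confusing features while the domain classifier still trains normally.

```python
# Hedged sketch of a gradient reversal layer (GRL), the mechanism the
# abstract above describes for adversarial domain adaptation.
class GradientReversal:
    def __init__(self, lam=1.0):
        self.lam = lam  # scaling factor for the reversed gradient

    def forward(self, x):
        return x  # identity: features pass through unchanged

    def backward(self, grad_output):
        # Negate (and scale) the gradient flowing back to the encoder.
        return [-self.lam * g for g in grad_output]

grl = GradientReversal(lam=0.5)
print(grl.forward([1.0, -2.0]))   # [1.0, -2.0]
print(grl.backward([1.0, 1.0]))   # [-0.5, -0.5]
```

In an autograd framework this would be a custom function with these forward/backward rules; the net effect is that minimizing the domain classifier's loss maximizes domain confusion in the shared representation.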

Proceedings ArticleDOI
14 Apr 2018
TL;DR: This work describes a convolutional neural network (CNN) based framework for sound event detection and classification using weakly labeled audio data and proposes methods to learn representations using this model which can be effectively used for solving the target task.
Abstract: In this work we propose approaches to effectively transfer knowledge from weakly labeled web audio data. We first describe a convolutional neural network (CNN) based framework for sound event detection and classification using weakly labeled audio data. Our model trains efficiently from audio recordings of variable length; hence, it is well suited for transfer learning. We then propose methods to learn representations using this model which can be effectively used for solving the target task. We study both transductive and inductive transfer learning tasks, showing the effectiveness of our methods for both domain and task adaptation. We show that the representations learned using the proposed CNN model generalize well enough to reach human-level accuracy on the ESC-50 sound events dataset and set state-of-the-art results on this dataset. We further use them for an acoustic scene classification task and once again show that our proposed approaches are well suited to this task. We also show that our methods are helpful in capturing semantic meanings and relations. Moreover, in the process we also set state-of-the-art results on the AudioSet dataset using a balanced training set.

Proceedings ArticleDOI
01 Nov 2018
TL;DR: This work compares four objectives—language modeling, translation, skip-thought, and autoencoding—on their ability to induce syntactic and part-of-speech information, holding constant the quantity and genre of the training data, as well as the LSTM architecture.
Abstract: Recently, researchers have found that deep LSTMs trained on tasks like machine translation learn substantial syntactic and semantic information about their input sentences, including part-of-speech. These findings begin to shed light on why pretrained representations, like ELMo and CoVe, are so beneficial for neural language understanding models. We still, though, do not yet have a clear understanding of how the choice of pretraining objective affects the type of linguistic information that models learn. With this in mind, we compare four objectives—language modeling, translation, skip-thought, and autoencoding—on their ability to induce syntactic and part-of-speech information, holding constant the quantity and genre of the training data, as well as the LSTM architecture.

Proceedings ArticleDOI
18 Jun 2018
TL;DR: In this paper, the authors apply well-established kernel methods to learn a non-linear mapping between the feature and attribute spaces, and propose an easy learning objective inspired by the Linear Discriminant Analysis, Kernel-Target Alignment and Kernel Polarization methods.
Abstract: In this paper, we address an open problem of zero-shot learning. Its principle is based on learning a mapping that associates feature vectors extracted from, e.g., images with attribute vectors that describe objects and/or scenes of interest. In turn, this allows classifying unseen object classes and/or scenes by matching feature vectors via the mapping to a newly defined attribute vector describing a new class. Due to the importance of such a learning task, there exist many methods that learn semantic, probabilistic, linear, or piece-wise linear mappings. In contrast, we apply well-established kernel methods to learn a non-linear mapping between the feature and attribute spaces. We propose an easy learning objective inspired by the Linear Discriminant Analysis, Kernel-Target Alignment, and Kernel Polarization methods [12, 8, 4] that promotes incoherence. We evaluate the performance of our algorithm on the Polynomial as well as shift-invariant Gaussian and Cauchy kernels. Despite the simplicity of our approach, we obtain state-of-the-art results on several zero-shot learning datasets and benchmarks, including the recent AWA2 dataset [45].
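The feature-to-attribute mapping idea can be illustrated with a generic kernel ridge regression stand-in. This is NOT the paper's LDA/KTA/Kernel-Polarization objective, just a hedged sketch of the pipeline it plugs into: learn a kernelized map from image features to attribute vectors on seen classes, then classify by nearest class-attribute prototype (all data here is synthetic and the names are ours).

```python
import numpy as np

# Hedged sketch: kernel ridge regression from features to attributes,
# then nearest-prototype classification (a generic stand-in for the
# paper's kernelized mapping, using a Gaussian/RBF kernel).
def rbf(A, B, gamma=1.0):
    d = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d)

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))        # synthetic "image features" (seen classes)
W = rng.normal(size=(3, 4))         # hypothetical class attribute prototypes
y = rng.integers(0, 3, size=20)     # seen-class labels
A = W[y]                            # per-sample attribute targets

K = rbf(X, X)
alpha = np.linalg.solve(K + 1e-2 * np.eye(len(X)), A)  # ridge solution

def classify(Xq, prototypes):
    P = rbf(Xq, X) @ alpha          # predicted attribute vectors
    d = ((P[:, None, :] - prototypes[None, :, :]) ** 2).sum(-1)
    return d.argmin(axis=1)         # nearest attribute prototype

acc = (classify(X, W) == y).mean()  # sanity check on training data
print(acc)  # typically 1.0 on this easy toy data
```

At test time, the prototype matrix would be extended with attribute vectors of unseen classes, which is what makes zero-shot classification possible without retraining the mapping.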

Journal ArticleDOI
TL;DR: This paper reports on a student presentation task used to introduce and raise awareness of Global Englishes in a Japanese English language classroom, showing that the task allowed students to select and explore Englishes salient to their experiences and interests and, by listening to their classmates' presentations, raised students' awareness of variation in English and challenged attitudes towards Englishes that differed from the standard models presented in typical ELT materials in Japan.
Abstract: Increasing students’ awareness of the globalization of English is a daunting task for teachers, especially considering the lack of globally oriented ELT materials available. This study builds on previous research in response to recent calls for more classroom-level research, and reports on the use of a student presentation task to introduce and raise awareness of Global Englishes in a Japanese English language classroom. An analysis of student reflections showed that the presentation task allowed students to select and explore Englishes salient to their experiences and interests. In researching and imparting knowledge of their chosen variety, and by listening to their classmates’ presentations, the task raised students’ awareness of variation in English, and challenged attitudes towards Englishes that differed from standard models presented in typical ELT materials in Japan. Tasks such as the one presented here provide practitioners with avenues to incorporate Global Englishes into classroom practice.

Journal ArticleDOI
TL;DR: This work shows how the HMM can be inferred on continuous, parcellated source-space Magnetoencephalography (MEG) task data in an unsupervised manner, without any knowledge of the task timings, and reveals task-dependent HMM states that represent whole-brain dynamic networks transiently bursting at millisecond time scales as cognition unfolds.
Abstract: Complex thought and behaviour arise through dynamic recruitment of large-scale brain networks. The signatures of this process may be observable in electrophysiological data; yet robust modelling of rapidly changing functional network structure on rapid cognitive timescales remains a considerable challenge. Here, we present one potential solution using Hidden Markov Models (HMMs), which are able to identify brain states characterised by engaging distinct functional networks that reoccur over time. We show how the HMM can be inferred on continuous, parcellated source-space Magnetoencephalography (MEG) task data in an unsupervised manner, without any knowledge of the task timings. We apply this to a freely available MEG dataset in which participants completed a face perception task, and reveal task-dependent HMM states that represent whole-brain dynamic networks transiently bursting at millisecond time scales as cognition unfolds. The analysis pipeline demonstrates a general way in which the HMM can be used to do a statistically valid whole-brain, group-level task analysis on MEG task data, which could be readily adapted to a wide range of task-based studies.
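The core HMM machinery behind this kind of analysis can be sketched with generic Viterbi decoding. This is not the authors' MEG pipeline (which infers states unsupervised from continuous multivariate data); it is a minimal discrete-observation example, with invented probabilities, of how a most-likely hidden state sequence is recovered from observations.

```python
import numpy as np

# Hedged sketch: Viterbi decoding for a 2-state HMM with discrete
# observations. Given start, transition, and emission probabilities,
# recover the most likely hidden-state path behind an observation run.
def viterbi(obs, start, trans, emit):
    logp = np.log(start * emit[:, obs[0]])       # log prob of each state at t=0
    back = []                                    # backpointers per step
    for o in obs[1:]:
        cand = logp[:, None] + np.log(trans) + np.log(emit[:, o])[None, :]
        back.append(cand.argmax(axis=0))         # best predecessor per state
        logp = cand.max(axis=0)
    path = [int(logp.argmax())]
    for b in reversed(back):                     # trace backpointers
        path.append(int(b[path[-1]]))
    return path[::-1]

start = np.array([0.6, 0.4])
trans = np.array([[0.9, 0.1], [0.2, 0.8]])       # "sticky" states
emit  = np.array([[0.8, 0.2], [0.3, 0.7]])       # 2 possible observations
print(viterbi([0, 0, 1, 1, 1], start, trans, emit))  # [0, 0, 1, 1, 1]
```

In the MEG setting, emissions would be multivariate Gaussian observation models over parcellated source-space data rather than a discrete table, but the decoding logic is the same.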

Journal ArticleDOI
TL;DR: The future trends and open issues of SC task allocation are investigated, including skill-based task allocation, group recommendation and collaboration, task composition and decomposition, and privacy-preserving task allocation.
Abstract: Spatial crowdsourcing (SC) is an emerging paradigm of crowdsourcing, which commits workers to move to some particular locations to perform spatio-temporal-relevant tasks (e.g., sensing and activity organization). Task allocation or worker selection is a significant problem that may impact the quality of completion of SC tasks. Based on a conceptual model and generic framework of SC task allocation, this paper first gives a review of the current state of research in this field, including single task allocation, multiple task allocation, low-cost task allocation, and quality-enhanced task allocation. We further investigate the future trends and open issues of SC task allocation, including skill-based task allocation, group recommendation and collaboration, task composition and decomposition, and privacy-preserving task allocation. Finally, we discuss the practical issues on real-world deployment as well as the challenges for large-scale user study in SC task allocation.
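Single task allocation, the first category the survey reviews, can be illustrated with a plain nearest-worker greedy baseline. This is not a method from the survey, just a hedged toy sketch: each spatio-temporal task is matched to the closest still-unassigned worker (all names and coordinates are invented).

```python
# Toy sketch of single task allocation in spatial crowdsourcing:
# greedily assign each task to the nearest unassigned worker.
def allocate(tasks, workers):
    """tasks/workers: dicts of name -> (x, y). Returns task -> worker."""
    free = dict(workers)
    plan = {}
    for t, (tx, ty) in tasks.items():
        if not free:
            break  # more tasks than workers
        w = min(free, key=lambda n: (free[n][0] - tx) ** 2 + (free[n][1] - ty) ** 2)
        plan[t] = w
        del free[w]  # each worker handles one task
    return plan

tasks = {"sense-A": (0, 0), "sense-B": (5, 5)}
workers = {"w1": (1, 0), "w2": (4, 4), "w3": (9, 9)}
print(allocate(tasks, workers))  # {'sense-A': 'w1', 'sense-B': 'w2'}
```

Real SC allocation, as the survey discusses, must additionally weigh worker skills, cost, completion quality, and privacy, which is what moves the problem beyond this greedy distance rule.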

Journal ArticleDOI
TL;DR: This paper proposes an optimization framework that generates task assignments and schedules for a human–robot team with the goal of improving both time and ergonomics and demonstrates its use in six real-world manufacturing processes that are currently performed manually.
Abstract: As collaborative robots begin to appear on factory floors, there is a need to consider how these robots can best help their human partners. In this paper, we propose an optimization framework that generates task assignments and schedules for a human–robot team with the goal of improving both time and ergonomics and demonstrate its use in six real-world manufacturing processes that are currently performed manually. Using the strain index method to quantify human physical stress, we create a set of solutions with assigned priorities on each goal. The resulting schedules provide engineers with insight into selecting the appropriate level of compromise and integrating the robot in a way that best fits the needs of an individual process. Note to Practitioners: Collaborative robots promise many advantages on the shop and factory floor, including low-cost automation and flexibility in small-batch production. Using this technology requires engineers to redesign tasks that are currently performed by human workers to effectively involve human and robot workers. However, existing quantitative methods for scheduling and allocating tasks to multiple workers do not consider factors such as differences in skill between human and robot workers or the differential ergonomic impact of tasks on workers. We propose a method to analyze how the inclusion of a collaborative robot in an existing process might affect the makespan of the task and the physical strain the task places on the human worker. The method enables the engineer to prioritize and weigh makespan and worker ergonomics in creating schedules and inspect the resulting task schedules. Using this method, engineers can determine how the addition of a collaborative robot might improve makespan and/or reduce job risk and potential for occupational hazard for human workers, particularly in tasks that involve high physical strain. We apply our method to six real-world tasks from various industries to demonstrate its use and discuss its practical limitations. In our future work, we plan to develop a software tool that will assist engineers in the use of our method.
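The time-versus-ergonomics trade-off can be sketched in miniature. The code below is not the paper's optimization framework; it brute-forces human/robot assignments for a two-task job and scores each by a weighted sum of makespan and total human strain. The durations and strain values are hypothetical stand-ins for strain-index measurements.

```python
from itertools import product

def best_assignment(tasks, w_time=0.5, w_strain=0.5):
    """Enumerate human/robot assignments; score = weighted makespan + human strain."""
    best = None
    for choice in product(("human", "robot"), repeat=len(tasks)):
        human_time = sum(t["human_time"] for t, c in zip(tasks, choice) if c == "human")
        robot_time = sum(t["robot_time"] for t, c in zip(tasks, choice) if c == "robot")
        makespan = max(human_time, robot_time)  # the two agents work in parallel
        strain = sum(t["strain"] for t, c in zip(tasks, choice) if c == "human")
        score = w_time * makespan + w_strain * strain
        if best is None or score < best[0]:
            best = (score, choice)
    return best[1]

# Hypothetical task data: duration for each agent plus a strain value.
tasks = [
    {"human_time": 4, "robot_time": 6, "strain": 9},  # strenuous: robot should take it
    {"human_time": 2, "robot_time": 8, "strain": 1},  # dexterous: human is faster
]
print(best_assignment(tasks))  # ('robot', 'human')
```

Sweeping `w_time` against `w_strain` reproduces, in toy form, the set of prioritized solutions the paper gives engineers to choose a level of compromise from.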

Proceedings ArticleDOI
18 Jun 2018
TL;DR: A local constraint regularized multitask network, called Partially Shared Multi-task Convolutional Neural Network with Local Constraint (PS-MCNN-LC), where PS structure and local constraint are integrated together to help the framework learn better attribute representations.
Abstract: In this paper, we study the face attribute learning problem by considering the identity information and attribute relationships simultaneously. In particular, we first introduce a Partially Shared Multi-task Convolutional Neural Network (PS-MCNN), in which four Task Specific Networks (TSNets) and one Shared Network (SNet) are connected by Partially Shared (PS) structures to learn better shared and task specific representations. To utilize identity information to further boost the performance, we introduce a local learning constraint which minimizes the difference between the representations of each sample and its local geometric neighbours with the same identity. Consequently, we present a local constraint regularized multitask network, called Partially Shared Multi-task Convolutional Neural Network with Local Constraint (PS-MCNN-LC), where PS structure and local constraint are integrated together to help the framework learn better attribute representations. The experimental results on CelebA and LFWA demonstrate the promise of the proposed methods.
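The "partially shared" idea can be shown without any deep-learning framework. The numpy sketch below is only a structural illustration, not the PS-MCNN architecture: each task head consumes its own task-specific features concatenated with features from a shared trunk, so the tasks share representation capacity while keeping private pathways. All dimensions and weights are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

d_in, d_shared, d_task, n_tasks = 16, 8, 8, 4

# Shared trunk (cf. SNet) and one task-specific trunk (cf. TSNet) per task.
W_shared = rng.standard_normal((d_in, d_shared))
W_task = [rng.standard_normal((d_in, d_task)) for _ in range(n_tasks)]
# Each head reads its task features *and* the shared features (the PS link).
W_head = [rng.standard_normal((d_task + d_shared, 1)) for _ in range(n_tasks)]

def forward(x):
    s = relu(x @ W_shared)              # shared representation
    logits = []
    for t in range(n_tasks):
        h = relu(x @ W_task[t])         # task-specific representation
        logits.append(np.concatenate([h, s]) @ W_head[t])
    return np.array(logits).ravel()     # one logit per attribute task

x = rng.standard_normal(d_in)
print(forward(x).shape)  # (4,)
```

The local constraint in PS-MCNN-LC would add a training loss pulling `forward`-style representations of same-identity neighbours together; that training loop is omitted here.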

Book ChapterDOI
20 Apr 2018
TL;DR: Relying heavily on concepts from Brunswik's and Gibson's ecological theories, ecological task analysis is proposed as a framework in which to predict the types of cognitive activity required to achieve productive behavior, and to suggest how interfaces can be manipulated to alleviate certain types of Cognitive demands.
Abstract: Cognitive engineering is largely concerned with creating environmental designs to support skillful and effective human activity. A set of necessary conditions is proposed for psychological models capable of supporting this enterprise. An analysis of the psychological nature of the design product is used to identify a set of constraints that models must meet if they are to usefully guide design. It is concluded that cognitive engineering requires models with resources for describing the integrated human-environment system, and that these models must be capable of describing the activities underlying fluent and effective interaction. These features are required in order to predict the cognitive activity that will be required given various design concepts, and to design systems that promote the acquisition of fluent, skilled behavior. These necessary conditions suggest that an ecological approach can provide valuable resources for psychological modeling to support design. Relying heavily on concepts from Brunswik's and Gibson's ecological theories, ecological task analysis is proposed as a framework in which to predict the types of cognitive activity required to achieve productive behavior, and to suggest how interfaces can be manipulated to alleviate certain types of cognitive demands. The framework is described in terms of, and illustrated with an example from, previous research on modeling skilled human-environment interaction.

Journal ArticleDOI
TL;DR: This paper first presents the unique features of MCS task allocation compared to generic crowdsourcing, and then provides a comprehensive review of problem formulations and allocation algorithms, together with future research opportunities.
Abstract: Mobile crowd sensing (MCS) is a special case of crowdsourcing that leverages smartphones with various embedded sensors and users' mobility to sense diverse phenomena in a city. Task allocation is a fundamental research issue in MCS, crucial for the efficiency and effectiveness of MCS applications. In this paper, we focus specifically on task allocation in MCS systems. We first present the unique features of MCS task allocation compared to generic crowdsourcing, and then provide a comprehensive review of problem formulations and allocation algorithms, together with future research opportunities.
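One formulation that recurs in the MCS task-allocation literature is selecting workers to maximize sensing coverage under a recruitment budget, typically attacked with a greedy cost-effectiveness heuristic. The sketch below illustrates that pattern only; it is not a specific algorithm from this survey, and the workers, regions, and costs are hypothetical.

```python
def greedy_coverage(workers, budget):
    """Greedily pick workers maximizing newly covered regions per unit cost."""
    covered, chosen, spent = set(), [], 0
    while True:
        best, best_gain = None, 0.0
        for wid, (regions, cost) in workers.items():
            if wid in chosen or spent + cost > budget:
                continue
            gain = len(regions - covered) / cost  # new coverage per unit cost
            if gain > best_gain:
                best, best_gain = wid, gain
        if best is None:
            return chosen, covered  # budget exhausted or no useful worker left
        chosen.append(best)
        regions, cost = workers[best]
        covered |= regions
        spent += cost

workers = {
    "w1": ({"r1", "r2"}, 2),        # regions this worker's route covers, cost
    "w2": ({"r2", "r3", "r4"}, 3),
    "w3": ({"r1"}, 1),
}
chosen, covered = greedy_coverage(workers, budget=5)
print(chosen, sorted(covered))  # ['w1', 'w2'] ['r1', 'r2', 'r3', 'r4']
```

Because maximum coverage is submodular, this kind of greedy selection carries a (1 - 1/e) approximation guarantee, which is one reason it appears so often as a baseline in MCS allocation work.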

Proceedings ArticleDOI
11 Jun 2018
TL;DR: The proposed Fog Following Me (Folo), a novel solution for latency and quality balanced task allocation in vehicular fog computing, is designed to support the mobility of vehicles, including ones generating tasks and the others serving as fog nodes.
Abstract: Emerging vehicular applications, such as real-time situational awareness and cooperative lane change, demand sufficient computing resources at the edge to conduct time-critical and data-intensive tasks. This paper proposes Fog Following Me (Folo), a novel solution for latency and quality balanced task allocation in vehicular fog computing. Folo is designed to support the mobility of vehicles, including those generating tasks and those serving as fog nodes. We formulate the process of task allocation across stationary and mobile fog nodes as a joint optimization problem, with constraints on service latency, quality loss, and fog capacity. As this is an NP-hard problem, we linearize it and solve it using Mixed Integer Linear Programming. To evaluate the effectiveness of Folo, we simulate the mobility of fog nodes at different times of day based on real-world taxi traces, and implement two representative tasks: video streaming and real-time object recognition. Compared with naive and random fog node selection, the latency and quality balanced task allocation provided by Folo achieves higher performance. More specifically, Folo shortens the average service latency by up to 41% while reducing the quality loss by up to 60%.
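The structure of the joint problem, trading latency against quality loss under per-node capacity, can be shown on a miniature instance. Folo solves this as a MILP; the sketch below instead brute-forces a two-task, two-node case with a weighted objective. All latency, quality-loss, and capacity numbers are hypothetical.

```python
from itertools import product

def allocate(latency, qloss, capacity, alpha=0.5):
    """Brute-force the task-to-fog-node assignment minimizing
    alpha * total latency + (1 - alpha) * total quality loss,
    subject to a per-node capacity (max tasks per node)."""
    n_tasks, n_nodes = len(latency), len(latency[0])
    best = None
    for choice in product(range(n_nodes), repeat=n_tasks):
        if any(choice.count(n) > capacity[n] for n in range(n_nodes)):
            continue  # capacity constraint violated
        cost = sum(alpha * latency[t][n] + (1 - alpha) * qloss[t][n]
                   for t, n in enumerate(choice))
        if best is None or cost < best[0]:
            best = (cost, choice)
    return best[1]

latency = [[1.0, 4.0], [1.5, 2.0]]   # latency[task][node], hypothetical
qloss = [[0.3, 0.1], [0.2, 0.1]]     # quality loss per task/node pairing
print(allocate(latency, qloss, capacity=[1, 2]))  # (0, 1)
```

Here node 0 is the low-latency choice for both tasks, but its capacity of one forces the second task onto node 1, which is exactly the kind of coupling that makes the full problem a joint optimization rather than a per-task pick.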

Journal ArticleDOI
TL;DR: A comprehensive survey of state-of-the-art task assignment mechanisms in mobile crowdsensing systems is presented and how each of the existing mechanisms works is introduced and their merits and deficiencies are discussed.
Abstract: Mobile crowdsensing has wide application perspectives and tremendous advantages over traditional sensor networks due to its low cost, extensive coverage, and high sensing accuracy properties. Task assignment is a crucial issue in mobile crowdsensing systems which is intended to achieve a good tradeoff between task quality and task cost. The design of efficient task assignment mechanisms has attracted a lot of attention and much work has been carried out. In this article, we present a comprehensive survey of state-of-the-art task assignment mechanisms in mobile crowdsensing systems. We will first introduce several fundamental issues in task assignment and classify existing mechanisms based on different design criteria. Then we introduce how each of the existing mechanisms works and discuss their merits and deficiencies. Finally, we discuss challenging issues and point out some future directions in this area.

Posted Content
TL;DR: The authors compare language modeling, translation, skip-thought, and autoencoding on syntactic and part-of-speech information, and find that language models consistently perform best on their syntactic auxiliary prediction tasks, even when trained on relatively small amounts of data.
Abstract: Recent work using auxiliary prediction task classifiers to investigate the properties of LSTM representations has begun to shed light on why pretrained representations, like ELMo (Peters et al., 2018) and CoVe (McCann et al., 2017), are so beneficial for neural language understanding models. We still, though, do not yet have a clear understanding of how the choice of pretraining objective affects the type of linguistic information that models learn. With this in mind, we compare four objectives (language modeling, translation, skip-thought, and autoencoding) on their ability to induce syntactic and part-of-speech information. We make a fair comparison between the tasks by holding constant the quantity and genre of the training data, as well as the LSTM architecture. We find that representations from language models consistently perform best on our syntactic auxiliary prediction tasks, even when trained on relatively small amounts of data. These results suggest that language modeling may be the best data-rich pretraining task for transfer learning applications requiring syntactic information. We also find that the representations from randomly-initialized, frozen LSTMs perform strikingly well on our syntactic auxiliary tasks, but this effect disappears when the amount of training data for the auxiliary tasks is reduced.
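The probing methodology itself is simple: freeze an encoder, train a lightweight classifier on its outputs, and read the probe's accuracy as evidence about what the representations encode. The deterministic toy below illustrates only that methodology; a fixed linear "frozen encoder" stands in for pretrained LSTM states, and a perceptron stands in for the auxiliary-task classifier. The data and labels are fabricated for illustration.

```python
import numpy as np

# Fixed "frozen encoder": stands in for pretrained LSTM representations.
W_frozen = np.array([[1.0, 0.0, 1.0],
                     [0.0, 1.0, -1.0]])

def encode(x):
    return x @ W_frozen

# Toy "syntactic property": the label depends only on the first input feature.
X = np.array([[1.0, 0.0], [2.0, 1.0], [-1.0, 0.0], [-2.0, 1.0]])
y = np.array([1, 1, 0, 0])
H = encode(X)  # the probe never sees X, only the frozen features

# Linear (perceptron) probe trained on the frozen representations.
w, b = np.zeros(H.shape[1]), 0.0
for _ in range(10):
    for h, t in zip(H, y):
        pred = int(h @ w + b > 0)
        w += (t - pred) * h
        b += float(t - pred)

acc = float(np.mean((H @ w + b > 0).astype(int) == y))
print(f"probe accuracy: {acc:.2f}")  # probe accuracy: 1.00
```

Comparing such probe accuracies across encoders pretrained with different objectives, with data and architecture held fixed, is precisely the experimental design the paper uses.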

Journal ArticleDOI
TL;DR: A case study based on real-world third-party IaaS clouds and some well-known scientific workflows shows that the proposed approach outperforms traditional approaches, especially those considering time-invariant or bounded performance only.
Abstract: Cloud computing is becoming an increasingly popular platform for the execution of scientific applications such as scientific workflows. In contrast to grids and other traditional high-performance computing systems, clouds provide a customizable infrastructure where scientific workflows can provision desired resources ahead of the execution and set up a required software environment on virtual machines (VMs). Nevertheless, various challenges, especially quality-of-service prediction and optimal scheduling, are yet to be addressed. Existing studies mainly consider workflow tasks to be executed with VMs having time-invariant, stochastic, or bounded performance and focus on minimizing workflow execution time or execution cost while meeting quality-of-service requirements. This work considers time-varying performance and aims at minimizing the execution cost of workflows deployed on Infrastructure-as-a-Service clouds while satisfying Service-Level Agreements with users. We employ time-series-based approaches to capture dynamic performance fluctuations, feed a genetic algorithm with the predicted performance of VMs, and generate schedules at run-time. A case study based on real-world third-party IaaS clouds and some well-known scientific workflows shows that our proposed approach outperforms traditional approaches, especially those considering time-invariant or bounded performance only.
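The "feed a genetic algorithm with predicted performance" step can be sketched in miniature. The code below is not the paper's scheduler: the per-VM speeds stand in for time-series forecasts, the SLA is simplified to a per-task deadline, and a small seeded GA evolves task-to-VM-type assignments minimizing cost with a penalty for deadline violations. All numbers are hypothetical.

```python
import random

random.seed(0)

# Hypothetical predicted performance (work units/hour) and price per VM type;
# in the paper these speeds would come from time-series forecasts.
predicted_speed = [10.0, 20.0, 40.0]
price_per_hour = [1.0, 2.5, 6.0]
task_work = [30.0, 20.0, 50.0, 10.0]  # work units per workflow task
deadline = 6.0                         # hours per task (simplified SLA)

def cost(assign):
    """Total monetary cost; heavily penalize tasks that miss the deadline."""
    total = 0.0
    for work, vm in zip(task_work, assign):
        hours = work / predicted_speed[vm]
        total += hours * price_per_hour[vm]
        if hours > deadline:
            total += 1000.0  # SLA-violation penalty
    return total

def evolve(pop_size=20, generations=40, n_vms=3):
    pop = [[random.randrange(n_vms) for _ in task_work] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=cost)
        survivors = pop[: pop_size // 2]          # elitist selection
        children = []
        for _ in range(pop_size - len(survivors)):
            a, b = random.sample(survivors, 2)
            cut = random.randrange(1, len(task_work))  # one-point crossover
            child = a[:cut] + b[cut:]
            if random.random() < 0.2:                  # mutation
                child[random.randrange(len(child))] = random.randrange(n_vms)
            children.append(child)
        pop = survivors + children
    return min(pop, key=cost)

best = evolve()
print(best, round(cost(best), 2))
```

Re-running `evolve` whenever new performance forecasts arrive mirrors, in toy form, the run-time schedule regeneration the paper describes.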