
Showing papers on "Task analysis published in 2019"


Journal ArticleDOI
TL;DR: Simulation results show that the proposed heuristic algorithm performs close to the optimal solution and significantly improves users' offloading utility over traditional approaches.
Abstract: Mobile-edge computing (MEC) is an emerging paradigm that provides a capillary distribution of cloud computing capabilities to the edge of the wireless access network, enabling rich services and applications in close proximity to the end users. In this paper, an MEC-enabled multi-cell wireless network is considered where each base station (BS) is equipped with an MEC server that assists mobile users in executing computation-intensive tasks via task offloading. The problem of joint task offloading and resource allocation is studied in order to maximize the users' task offloading gains, measured by a weighted sum of reductions in task completion time and energy consumption. The considered problem is formulated as a mixed-integer nonlinear program (MINLP) that jointly optimizes the task offloading decision, the uplink transmission power of mobile users, and the computing resource allocation at the MEC servers. Due to the combinatorial nature of this problem, solving for the optimal solution is difficult and impractical for a large-scale network. To overcome this drawback, we propose to decompose the original problem into a resource allocation (RA) problem with a fixed task offloading decision and a task offloading (TO) problem that optimizes the optimal-value function corresponding to the RA problem. We address the RA problem using convex and quasi-convex optimization techniques, and propose a novel heuristic algorithm for the TO problem that achieves a suboptimal solution in polynomial time. Simulation results show that our algorithm performs close to the optimal solution and significantly improves the users' offloading utility over traditional approaches.
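To make the decomposition concrete, here is a minimal Python sketch of the decompose-then-search idea: fix a candidate offloading set, evaluate a toy resource allocation (RA) subproblem, and grow the set greedily. The utility and cost models (cycles, local_cpu, tx_time, equal CPU sharing) are invented for illustration and are not the paper's formulation or its heuristic.

```python
def ra_utility(offload_set, users, server_cpu):
    """Toy RA step: split server CPU equally among offloaders and return the
    total offloading gain (reduction in completion time, clipped at zero)."""
    if not offload_set:
        return 0.0
    share = server_cpu / len(offload_set)
    total = 0.0
    for u in offload_set:
        local_time = users[u]["cycles"] / users[u]["local_cpu"]
        edge_time = users[u]["cycles"] / share + users[u]["tx_time"]
        total += max(0.0, local_time - edge_time)
    return total

def greedy_to(users, server_cpu):
    """Greedy TO heuristic: repeatedly add the user whose offloading most
    increases the RA utility; stop when no single addition helps."""
    chosen, best = set(), 0.0
    improved = True
    while improved:
        improved = False
        for u in set(users) - chosen:
            val = ra_utility(chosen | {u}, users, server_cpu)
            if val > best:
                best, pick, improved = val, u, True
        if improved:
            chosen.add(pick)
    return chosen, best

users = {
    "u1": {"cycles": 2e9, "local_cpu": 1e9, "tx_time": 0.3},
    "u2": {"cycles": 5e8, "local_cpu": 1e9, "tx_time": 0.8},
}
print(greedy_to(users, server_cpu=4e9))  # -> ({'u1'}, 1.2)
```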

705 citations


Proceedings ArticleDOI
20 May 2019
TL;DR: This work uses self-supervision to learn a compact and multimodal representation of sensory inputs, which can then be used to improve the sample efficiency of policy learning with deep reinforcement learning algorithms.
Abstract: Contact-rich manipulation tasks in unstructured environments often require both haptic and visual feedback. However, it is non-trivial to manually design a robot controller that combines modalities with very different characteristics. While deep reinforcement learning has shown success in learning control policies for high-dimensional inputs, these algorithms are generally intractable to deploy on real robots due to sample complexity. We use self-supervision to learn a compact and multimodal representation of our sensory inputs, which can then be used to improve the sample efficiency of our policy learning. We evaluate our method on a peg insertion task, generalizing over different geometries, configurations, and clearances, while being robust to external perturbations. We present results in simulation and on a real robot.
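As a rough illustration of what a compact multimodal representation can look like, the PyTorch sketch below encodes a camera frame and a force/torque reading separately and fuses them into one vector. The layer sizes and the 6-axis wrench input are assumptions for the example, not the authors' architecture, and the self-supervised training objective is omitted.

```python
import torch
import torch.nn as nn

class MultimodalEncoder(nn.Module):
    """Fuses a vision stream and a haptic stream into one compact vector."""
    def __init__(self, z_dim=128):
        super().__init__()
        self.vision = nn.Sequential(  # tiny CNN over 64x64 RGB frames
            nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.Flatten(), nn.LazyLinear(z_dim),
        )
        self.haptic = nn.Sequential(  # MLP over a 6-axis force/torque reading
            nn.Linear(6, 64), nn.ReLU(), nn.Linear(64, z_dim),
        )
        self.fuse = nn.Linear(2 * z_dim, z_dim)

    def forward(self, img, wrench):
        z = torch.cat([self.vision(img), self.haptic(wrench)], dim=-1)
        return self.fuse(z)

enc = MultimodalEncoder()
z = enc(torch.randn(8, 3, 64, 64), torch.randn(8, 6))
print(z.shape)  # torch.Size([8, 128]) -- a policy would consume this vector
```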

322 citations


Journal ArticleDOI
TL;DR: In this article, the authors proposed a new system design, where probabilistic and statistical constraints are imposed on task queue lengths, by applying extreme value theory to minimize users' power consumption while trading off the allocated resources for local computation and task offloading.
Abstract: To overcome devices' limitations in performing computation-intense applications, mobile edge computing (MEC) enables users to offload tasks to proximal MEC servers for faster task computation. However, the current MEC system design is based on average-based metrics, which fails to account for the ultra-reliable low-latency requirements of mission-critical applications. To tackle this, this paper proposes a new system design in which probabilistic and statistical constraints are imposed on task queue lengths by applying extreme value theory. The aim is to minimize users' power consumption while trading off the allocated resources for local computation and task offloading. Due to wireless channel dynamics, users are re-associated to MEC servers in order to offload tasks at higher rates or to access more proximal servers. In this regard, a user–server association policy is proposed, taking into account the channel quality as well as the servers' computation capabilities and workloads. By marrying tools from Lyapunov optimization and matching theory, a two-timescale mechanism is proposed, where the user–server association is solved on the long timescale, while a dynamic task offloading and resource allocation policy is executed on the short timescale. The simulation results corroborate the effectiveness of the proposed approach, guaranteeing highly reliable task computation and lower delay compared to several baselines.
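The Lyapunov side of such a mechanism can be illustrated with a classic drift-plus-penalty loop: each slot, pick the transmit power that minimizes V*(power) - Q*(service), so a long queue forces more service while a short queue saves power. Everything below (the rate model, arrival process, and the value of V) is a toy stand-in, not the paper's two-timescale formulation.

```python
import numpy as np

rng = np.random.default_rng(0)
V = 50.0    # tradeoff weight: larger V favors power saving over backlog
Q = 0.0     # task queue backlog
powers = np.linspace(0.0, 1.0, 11)  # candidate transmit powers

for t in range(1000):
    arrival = rng.exponential(2.0)           # new task bits this slot
    gain = rng.exponential(1.0)              # channel gain this slot
    service = np.log2(1.0 + gain * powers)   # toy rate for each candidate power
    # drift-plus-penalty: minimize V*p - Q*service(p) over candidate powers p
    p = powers[np.argmin(V * powers - Q * service)]
    Q = max(Q + arrival - np.log2(1.0 + gain * p), 0.0)

print(f"final backlog: {Q:.2f}")
```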

297 citations


Journal ArticleDOI
TL;DR: In this article, a model based on a multi-branch deep architecture was proposed to predict the driver's focus of attention while driving, which part of the scene around the vehicle is more critical for the task.
Abstract: In this work we aim to predict the driver's focus of attention. The goal is to estimate what a person would pay attention to while driving, and which part of the scene around the vehicle is more critical for the task. To this end we propose a new computer vision model based on a multi-branch deep architecture that integrates three sources of information: raw video, motion, and scene semantics. We also introduce DR(eye)VE, the largest dataset of driving scenes for which eye-tracking annotations are available. This dataset features more than 500,000 registered frames, matching ego-centric views (from glasses worn by drivers) and car-centric views (from a roof-mounted camera), further enriched by other sensor measurements. Results highlight that several attention patterns are shared across drivers and can be reproduced to some extent. The indication of which elements in the scene are likely to capture the driver's attention may benefit several applications in the context of human-vehicle interaction and driver attention analysis.

172 citations


Journal ArticleDOI
TL;DR: This paper proposes a personalized privacy-preserving task allocation framework for mobile crowdsensing that can allocate tasks effectively while providing personalized location privacy protection, and proposes a Vickrey Payment Determination Mechanism to determine the appropriate payment to each winner by considering its movement cost and privacy level.
Abstract: Location information of workers is usually required for optimal task allocation in mobile crowdsensing, which, however, raises severe concerns about location privacy leakage. Although many approaches have been proposed to protect the locations of users, location protection for task allocation in mobile crowdsensing has not been well explored. In addition, to the best of our knowledge, none of the existing privacy-preserving task allocation mechanisms can provide personalized location protection that considers the different protection demands of workers. In this paper, we propose a personalized privacy-preserving task allocation framework for mobile crowdsensing that can allocate tasks effectively while providing personalized location privacy protection. The basic idea is that each worker uploads obfuscated distances and a personal privacy level to the server instead of its true location or distances to tasks. In particular, we propose a Probabilistic Winner Selection Mechanism (PWSM) to minimize the total travel distance given the obfuscated information from workers, by allocating each task to the worker with the largest probability of being closest to it. Moreover, we propose a Vickrey Payment Determination Mechanism (VPDM) to determine the appropriate payment to each winner by considering its movement cost and privacy level, which satisfies truthfulness, profitability, and probabilistic individual rationality. Extensive experiments on real-world datasets demonstrate the effectiveness of the proposed mechanisms.
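The spirit of PWSM can be sketched with a small Monte Carlo estimate: given each worker's obfuscated distance and noise scale, estimate the probability that each worker is truly closest and allocate the task accordingly. The Laplace noise model and the resampling step are illustrative assumptions, not the paper's exact mechanism.

```python
import numpy as np

rng = np.random.default_rng(1)

def prob_closest(reported, scales, n_samples=100_000):
    """Estimate P(worker i is truly closest) under symmetric Laplace noise."""
    reported = np.asarray(reported)[:, None]
    scales = np.asarray(scales)[:, None]
    # resample plausible true distances: reported + symmetric Laplace noise
    samples = reported + rng.laplace(0.0, scales, (len(scales), n_samples))
    wins = np.bincount(np.argmin(samples, axis=0), minlength=len(scales))
    return wins / n_samples

# worker 1 reports a slightly larger distance, but with far less noise
probs = prob_closest(reported=[3.0, 3.4, 5.0], scales=[1.0, 0.2, 0.5])
print(probs, "-> allocate to worker", int(np.argmax(probs)))
```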

170 citations


Journal ArticleDOI
TL;DR: The results show that networks trained to regress to the ground truth targets for labeled data and to simultaneously learn to rank unlabeled data obtain significantly better, state-of-the-art results for both IQA and crowd counting.
Abstract: For many applications the collection of labeled data is expensive and laborious. Exploitation of unlabeled data during training is thus a long-pursued objective of machine learning. Self-supervised learning addresses this by positing an auxiliary task (different, but related to the supervised task) for which data is abundantly available. In this paper, we show how ranking can be used as a proxy task for some regression problems. As another contribution, we propose an efficient backpropagation technique for Siamese networks which prevents the redundant computation introduced by the multi-branch network architecture. We apply our framework to two regression problems: Image Quality Assessment (IQA) and Crowd Counting. For both we show how to automatically generate ranked image sets from unlabeled data. Our results show that networks trained to regress to the ground-truth targets for labeled data and to simultaneously learn to rank unlabeled data obtain significantly better, state-of-the-art results for both IQA and crowd counting. In addition, we show that measuring network uncertainty on the self-supervised proxy task is a good measure of the informativeness of unlabeled data. This can be used to drive an algorithm for active learning, and we show that this reduces labeling effort by up to 50 percent.
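The ranking proxy task itself is easy to set up when a "better/worse" relation between two images is known, e.g., an image versus a synthetically degraded copy of it. The sketch below uses PyTorch's margin ranking loss with a toy scorer; the backbone and data are stand-ins, and the paper's efficient Siamese backpropagation trick is not shown.

```python
import torch
import torch.nn as nn

backbone = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32 * 3, 1))  # toy scorer
rank_loss = nn.MarginRankingLoss(margin=0.5)

img_a = torch.randn(16, 3, 32, 32)             # reference images
img_b = img_a + 0.3 * torch.randn_like(img_a)  # degraded copies (known worse)
score_a, score_b = backbone(img_a), backbone(img_b)

# target = 1 encodes "score_a should exceed score_b by the margin"
loss = rank_loss(score_a, score_b, torch.ones_like(score_a))
loss.backward()
print(float(loss))
```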

161 citations


Journal ArticleDOI
TL;DR: This paper proposes Folo, a novel solution for latency and quality optimized task allocation in vehicular fog computing (VFC), and proposes an event-triggered dynamic task allocation framework using linear programming-based optimization and binary particle swarm optimization.
Abstract: With emerging vehicular applications, such as real-time situational awareness and cooperative lane change, there exist huge demands for sufficient computing resources at the edge to conduct time-critical and data-intensive tasks. This paper proposes Folo, a novel solution for latency- and quality-optimized task allocation in vehicular fog computing (VFC). Folo is designed to support the mobility of vehicles, including vehicles that generate tasks and others that serve as fog nodes. Considering constraints on service latency, quality loss, and fog capacity, the process of task allocation across stationary and mobile fog nodes is formulated as a joint optimization problem, which is NP-hard. In this paper, we present the task allocation to fog nodes as a bi-objective minimization problem, where a tradeoff is maintained between service latency and quality loss. Specifically, we propose an event-triggered dynamic task allocation framework using linear programming-based optimization and binary particle swarm optimization. To assess the effectiveness of Folo, we simulated the mobility of fog nodes at different times of day based on real-world taxi traces and implemented two representative tasks, video streaming and real-time object recognition. Simulation results show that the task allocation provided by Folo can be adjusted according to the actual requirements on service latency and quality, and achieves higher performance than naive and random fog node selection. To be more specific, Folo shortens the average service latency by up to 27% while reducing the quality loss by up to 56%.
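Since Folo's allocation step relies on binary particle swarm optimization, a compact generic BPSO is sketched below; the fitness function is a placeholder, not Folo's latency/quality objective.

```python
import numpy as np

rng = np.random.default_rng(2)

def fitness(x):  # placeholder objective: prefer about half of the bits set
    return -abs(x.sum() - x.size / 2)

def bpso(dim=20, n_particles=30, iters=100, w=0.7, c1=1.5, c2=1.5):
    x = rng.integers(0, 2, (n_particles, dim)).astype(float)
    v = rng.normal(0, 1, (n_particles, dim))
    pbest, pbest_val = x.copy(), np.array([fitness(p) for p in x])
    gbest = pbest[np.argmax(pbest_val)].copy()
    for _ in range(iters):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        # sigmoid transfer: velocity sets the probability of a bit being 1
        x = (rng.random(x.shape) < 1.0 / (1.0 + np.exp(-v))).astype(float)
        vals = np.array([fitness(p) for p in x])
        improved = vals > pbest_val
        pbest[improved], pbest_val[improved] = x[improved], vals[improved]
        gbest = pbest[np.argmax(pbest_val)].copy()
    return gbest, pbest_val.max()

bits, best = bpso()
print(bits, best)
```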

159 citations


Proceedings ArticleDOI
20 May 2019
TL;DR: It is shown that contact-rich manipulation behavior with multi-fingered hands can be learned by directly training with model-free deep RL algorithms in the real world, with minimal additional assumptions and without the aid of simulation, indicating that direct deep RL training in the real world is a viable and practical alternative to simulation and model-based control.
Abstract: Dexterous multi-fingered robotic hands can perform a wide range of manipulation skills, making them an appealing component for general-purpose robotic manipulators. However, such hands pose a major challenge for autonomous control, due to the high dimensionality of their configuration space and complex intermittent contact interactions. In this work, we propose deep reinforcement learning (deep RL) as a scalable solution for learning complex, contact-rich behaviors with multi-fingered hands. Deep RL provides an end-to-end approach to directly map sensor readings to actions, without the need for task-specific models or policy classes. We show that contact-rich manipulation behavior with multi-fingered hands can be learned by directly training with model-free deep RL algorithms in the real world, with minimal additional assumptions and without the aid of simulation. We learn to perform a variety of tasks on two different low-cost hardware platforms entirely from scratch, and further study how the learning can be accelerated by using a small number of human demonstrations. Our experiments demonstrate that complex multi-fingered manipulation skills can be learned in the real world in about 4-7 hours for most tasks, and that demonstrations can decrease this to 2-3 hours, indicating that direct deep RL training in the real world is a viable and practical alternative to simulation and model-based control. https://sites.google.com/view/deeprl-handmanipulation

153 citations


Proceedings ArticleDOI
Tao Ouyang, Rui Li, Xu Chen, Zhi Zhou, Xin Tang
01 Apr 2019
TL;DR: A novel adaptive user-managed service placement mechanism is proposed that jointly optimizes a user's perceived latency and service migration cost, weighted by user preferences, and helps the user make adaptive service placement decisions.
Abstract: Mobile Edge Computing (MEC), envisioned as a cloud extension, pushes cloud resources from the network core to the network edge, thereby meeting the stringent service requirements of many emerging computation-intensive mobile applications. Many existing works have focused on system-wide MEC service placement issues, while personalized service performance optimization has received much less attention. Thus, in this paper we propose a novel adaptive user-managed service placement mechanism, which jointly optimizes a user's perceived latency and service migration cost, weighted by user preferences. To overcome the unavailability of future information and unknown system dynamics, we formulate the dynamic service placement problem as a contextual multi-armed bandit (MAB) problem, and then propose a Thompson-sampling-based online learning algorithm to explore the dynamic MEC environment, which further assists the user in making adaptive service placement decisions. Rigorous theoretical analysis and extensive evaluations demonstrate the superior performance of the proposed adaptive user-managed service placement mechanism.
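For readers unfamiliar with Thompson sampling, the bandit loop below shows the core idea in its simplest Beta-Bernoulli form: sample a belief for each candidate server, place the service on the most promising one, and update the posterior with the observed outcome. The Bernoulli "low latency" reward and the priors are illustrative; the paper's contextual formulation is richer.

```python
import numpy as np

rng = np.random.default_rng(3)
true_p = [0.3, 0.6, 0.5]                 # unknown P(low latency) per server
alpha, beta = np.ones(3), np.ones(3)     # Beta(1,1) priors, one per arm

for t in range(2000):
    theta = rng.beta(alpha, beta)        # sample one belief per arm
    arm = int(np.argmax(theta))          # place the service on the best draw
    reward = rng.random() < true_p[arm]  # observe a toy latency outcome
    alpha[arm] += reward
    beta[arm] += 1 - reward

print("posterior means:", np.round(alpha / (alpha + beta), 3))
```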

146 citations


Proceedings ArticleDOI
01 Sep 2019
TL;DR: The ICDAR 2019 Challenge on "Scanned receipts OCR and key information extraction" (SROIE) covers important aspects of the automated analysis of scanned receipts, and is expected to evolve into a useful resource for the community, drawing further attention and promoting research and development efforts in this field.
Abstract: The ICDAR 2019 Challenge on "Scanned receipts OCR and key information extraction" (SROIE) covers important aspects of the automated analysis of scanned receipts. The SROIE tasks play a key role in many document analysis systems and hold significant commercial potential. Although a lot of work has been published over the years on administrative document analysis, the community has advanced relatively slowly, as most datasets have been kept private. One of the key contributions of SROIE to the document analysis community is to offer a first, standardized dataset of 1000 whole scanned receipt images and annotations, as well as an evaluation procedure for such tasks. The Challenge is structured around three tasks, namely Scanned Receipt Text Localization (Task 1), Scanned Receipt OCR (Task 2), and Key Information Extraction from Scanned Receipts (Task 3). The competition opened on 10th February, 2019 and closed on 5th May, 2019. We received 29, 24, and 18 valid submissions for the three competition tasks, respectively. This report presents the competition datasets, defines the tasks and the evaluation protocols, offers detailed submission statistics, and analyzes the performance of the submitted methods. While the tasks of text localization and recognition seem relatively easy to tackle, it is interesting to observe the variety of ideas and approaches proposed for the information extraction task. According to the submissions' performance, we believe there is still margin for improving information extraction performance, although the current dataset would have to grow substantially in following editions. Given the success of the SROIE competition, evidenced by the wide interest generated and the healthy number of submissions from academia, research institutes, and industry across different countries, we consider that the SROIE competition can evolve into a useful resource for the community, drawing further attention and promoting research and development efforts in this field.

Proceedings ArticleDOI
10 Feb 2019
TL;DR: A method to generate vectorial representations of visual classification tasks which can be used to reason about the nature of those tasks and their relations, and is demonstrated to be capable of predicting task similarities that match the authors' intuition about semantic and taxonomic relations between different visual tasks.
Abstract: We introduce a method to generate vectorial representations of visual classification tasks which can be used to reason about the nature of those tasks and their relations. Given a dataset with ground-truth labels and a loss function, we process images through a "probe network" and compute an embedding based on estimates of the Fisher information matrix associated with the probe network parameters. This provides a fixed-dimensional embedding of the task that is independent of details such as the number of classes and requires no understanding of the class label semantics. We demonstrate that this embedding is capable of predicting task similarities that match our intuition about semantic and taxonomic relations between different visual tasks. We demonstrate the practical value of this framework for the meta-task of selecting a pre-trained feature extractor for a novel task. We present a simple meta-learning framework for learning a metric on embeddings that is capable of predicting which feature extractors will perform well on which task. Selecting a feature extractor with task embedding yields performance close to the best available feature extractor, with substantially less computational effort than exhaustively training and evaluating all available models.
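A bare-bones version of the embedding computation might look as follows: push a batch through a fixed probe network, take the squared gradients of the loss with respect to the probe's parameters as a one-batch diagonal Fisher estimate, and compare tasks by cosine similarity. The probe, shapes, and single-batch estimate are simplifications of the paper's procedure.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

probe = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))  # toy probe network

def task_embedding(images, labels):
    """One-batch diagonal Fisher estimate w.r.t. the probe's parameters."""
    probe.zero_grad()
    F.cross_entropy(probe(images), labels).backward()
    grads = torch.cat([p.grad.flatten() for p in probe.parameters()])
    return grads.pow(2)

task_a = task_embedding(torch.randn(64, 1, 28, 28), torch.randint(0, 10, (64,)))
task_b = task_embedding(torch.randn(64, 1, 28, 28), torch.randint(0, 10, (64,)))
print(F.cosine_similarity(task_a, task_b, dim=0))  # task-similarity score
```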

Proceedings ArticleDOI
Yiru Wang, Weihao Gan, Jie Yang, Wei Wu, Junjie Yan
01 Oct 2019
TL;DR: In this paper, a unified framework called Dynamic Curriculum Learning (DCL) is proposed to adaptively adjust the sampling strategy and loss weight in each batch, which results in better ability of generalization and discrimination.
Abstract: Human attribute analysis is a challenging task in the field of computer vision. One significant difficulty stems from highly imbalanced data distributions. Conventional techniques such as re-sampling and cost-sensitive learning require prior knowledge to train the system. To address this problem, we propose a unified framework called Dynamic Curriculum Learning (DCL) to adaptively adjust the sampling strategy and loss weight in each batch, which yields better generalization and discrimination ability. Inspired by curriculum learning, DCL consists of two-level curriculum schedulers: (1) a sampling scheduler, which manages the data distribution not only from imbalanced to balanced but also from easy to hard; and (2) a loss scheduler, which controls the learning importance between the classification and metric learning losses. With these two schedulers, we achieve state-of-the-art performance on the widely used face attribute dataset CelebA and the pedestrian attribute dataset RAP.
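The sampling scheduler's "imbalanced to balanced" progression can be pictured as interpolating class-sampling weights from the data distribution toward uniform over training; the linear schedule below is one simple choice, not necessarily the paper's scheduling function.

```python
import numpy as np

class_counts = np.array([900, 80, 20])   # imbalanced attribute classes (toy)
data_dist = class_counts / class_counts.sum()
uniform = np.ones_like(data_dist) / len(data_dist)

def sampling_weights(epoch, total_epochs):
    g = min(epoch / total_epochs, 1.0)   # 0 -> follow the data, 1 -> balanced
    return (1 - g) * data_dist + g * uniform

for e in (0, 5, 10):
    print(e, np.round(sampling_weights(e, 10), 3))
```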

Journal ArticleDOI
TL;DR: This paper studies a novel device-to-device (D2D)-enabled multi-helper MEC system, in which a local user solicits its nearby WDs serving as helpers for cooperative computation and proposes an efficient algorithm by first relaxing the original problem into a convex one, and then constructing a suboptimal task assignment solution based on the obtained optimal one.
Abstract: With the proliferation of computation-intensive and latency-critical applications in 5G and beyond networks, mobile-edge computing (MEC) or fog computing, which provides cloud-like computation and/or storage capabilities at the network edge, is envisioned to reduce computation latency as well as to conserve energy for wireless devices (WDs). This paper studies a novel device-to-device (D2D)-enabled multi-helper MEC system, in which a local user solicits nearby WDs to serve as helpers for cooperative computation. We assume a time division multiple access (TDMA) transmission protocol, under which the local user offloads tasks to multiple helpers and downloads the results from them over orthogonal pre-scheduled time slots. Under this setup, we minimize the computation latency by optimizing the local user's task assignment jointly with the time and rate for task offloading and results downloading, as well as the computation frequency for task execution, subject to individual energy and computation capacity constraints at the local user and the helpers. However, the formulated problem is a mixed-integer nonlinear program (MINLP) that is difficult to solve. To tackle this challenge, we propose an efficient algorithm that first relaxes the original problem into a convex one, and then constructs a suboptimal task assignment solution based on the obtained optimal one. Furthermore, we consider a benchmark scheme that endows the WDs with their maximum computation capacities. To further reduce the implementation complexity, we also develop a heuristic scheme based on greedy task assignment. Finally, the numerical results validate the effectiveness of our proposed algorithm, compared against the heuristic scheme and other benchmarks without either joint optimization of radio and computation resources or task assignment design.
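The relax-then-round step the algorithm builds on is a generic pattern: solve a continuous relaxation of the binary assignment, then round the fractional solution back to a feasible 0/1 assignment. The snippet below only illustrates that rounding pattern on random fractional shares; it does not solve the paper's MINLP.

```python
import numpy as np

rng = np.random.default_rng(4)
# stand-in for a relaxed solver's output: fractional shares of 6 tasks over
# 3 helpers, with each task's shares summing to one
frac = rng.dirichlet(np.ones(3), size=6)

assignment = frac.argmax(axis=1)  # round each task to its largest share
print(frac.round(2))
print("task -> helper:", assignment)
```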

Journal ArticleDOI
TL;DR: A new framework to facilitate robot skill generalization is proposed, in which the learned skills are first segmented into a sequence of subskills automatically, and then each individual subskill is encoded and regulated accordingly.
Abstract: Robots are often required to generalize the skills learned from human demonstrations to fulfil new task requirements. However, skill generalization is difficult to realize in the following situations: the skill for a complex multistep task includes a number of features; special constraints are imposed on the robot during task reproduction; or the situation is completely new and quite different from the one in which the demonstrations were given. This work proposes a new framework to facilitate robot skill generalization. The basic idea is that the learned skills are first segmented into a sequence of subskills automatically, and then each individual subskill is encoded and regulated accordingly. Specifically, we adapt each set of segmented movement trajectories individually instead of the whole movement profiles, making the realization of skill generalization more convenient. In addition, human limb stiffness estimated from surface electromyographic signals is considered in the framework for the realization of human-to-robot variable impedance control skill transfer, as well as the generalization of both movement trajectories and stiffness profiles. An experimental study has been performed to verify the effectiveness of the proposed framework.

Journal ArticleDOI
TL;DR: A convex–concave-procedure-based contract optimization algorithm for server recruitment and a matching-learning-based task offloading mechanism, which takes both occurrence awareness and conflict awareness into consideration, are proposed.
Abstract: Vehicular fog computing has emerged as a cost-efficient solution for task processing in vehicular networks. However, how to realize effective server recruitment and reliable task offloading under information asymmetry and uncertainty remains a critical challenge. In this paper, we adopt a two-stage task offloading framework to address this challenge. First, we propose a convex–concave-procedure-based contract optimization algorithm for server recruitment, which aims to maximize the expected utility of the operator under asymmetric information. Then, a low-complexity and stable task offloading mechanism is proposed to minimize the total network delay based on pricing-based matching. Furthermore, we extend the work to the scenario of information uncertainty and develop a matching-learning-based task offloading mechanism, which takes both occurrence awareness and conflict awareness into consideration. Simulation results demonstrate that the proposed algorithm can effectively motivate resource sharing and guarantee bounded deviation from the optimal performance without global information.

Proceedings ArticleDOI
01 Oct 2019
TL;DR: This survey reviews the concepts and definitions related to transfer learning, lists the different terms used in the literature, and brings together the points of view of the authors of prior surveys to give a clear vision of directions for future work in this field of research.
Abstract: Transfer learning is an emerging topic that may drive the success of machine learning in research and industry. The lack of data on specific tasks is one of the main reasons to use it, since collecting and labeling data can be very expensive and time-consuming, and recent concerns about privacy make it difficult to use real data from users. Transfer learning also helps to rapidly prototype new machine learning models using pre-trained models from a source task, since training on millions of images can take time and requires expensive GPUs. In this survey, we review the concepts and definitions related to transfer learning and list the different terms used in the literature. We bring together the points of view of the authors of prior surveys, adding more recent findings, in order to give a clear vision of directions for future work in this field of research.

Proceedings ArticleDOI
08 Apr 2019
TL;DR: In this paper, the authors proposed Neural Multi-Task Recommendation (NMTR) for learning recommender systems from user multi-behavior data, which accounts for the cascading relationship among different types of behaviors and performs a joint optimization based on the multi-task learning framework.
Abstract: Most existing recommender systems leverage user behavior data of one type, such as purchase behavior data in e-commerce. We argue that other types of user behavior data, such as views and clicks, also provide valuable signals. In this work, we contribute a new solution named NMTR (short for Neural Multi-Task Recommendation) for learning recommender systems from user multi-behavior data. In particular, our model accounts for the cascading relationship among different types of behaviors (e.g., a user must click on a product before purchasing it). We perform a joint optimization based on the multi-task learning framework, where the optimization on each behavior is treated as a task. Extensive experiments on a real-world dataset demonstrate that NMTR significantly outperforms state-of-the-art recommender systems that are designed to learn from both single-behavior data and multi-behavior data.
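The cascading constraint (click before purchase) can be wired directly into a model by expressing the purchase probability as the click probability times a conditional purchase head. The toy PyTorch module below shows that factorization; the embedding interaction and heads are illustrative, not NMTR's exact architecture.

```python
import torch
import torch.nn as nn

class CascadedMTL(nn.Module):
    def __init__(self, n_users, n_items, dim=32):
        super().__init__()
        self.user = nn.Embedding(n_users, dim)
        self.item = nn.Embedding(n_items, dim)
        self.click_head = nn.Linear(dim, 1)
        self.buy_head = nn.Linear(dim, 1)

    def forward(self, u, i):
        z = self.user(u) * self.item(i)
        p_click = torch.sigmoid(self.click_head(z))
        p_buy_given_click = torch.sigmoid(self.buy_head(z))
        return p_click, p_click * p_buy_given_click  # cascaded purchase prob.

model = CascadedMTL(n_users=100, n_items=500)
p_click, p_buy = model(torch.tensor([3]), torch.tensor([42]))
bce = nn.BCELoss()  # one loss term (task) per behavior type
loss = bce(p_click, torch.ones_like(p_click)) + bce(p_buy, torch.zeros_like(p_buy))
loss.backward()
```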

Journal ArticleDOI
TL;DR: In this article, the authors report results from a crowdsourced experiment to evaluate the effectiveness of five small scale (5-34 data points) two-dimensional visualization types (Table, Line Chart, Bar Chart, Scatterplot, and Pie Chart) across ten common data analysis tasks using two datasets.
Abstract: Visualizations of tabular data are widely used; understanding their effectiveness in different task and data contexts is fundamental to scaling their impact. However, little is known about how basic tabular data visualizations perform across varying data analysis tasks. In this paper, we report results from a crowdsourced experiment to evaluate the effectiveness of five small-scale (5-34 data points) two-dimensional visualization types—Table, Line Chart, Bar Chart, Scatterplot, and Pie Chart—across ten common data analysis tasks using two datasets. We find that the effectiveness of these visualization types varies significantly across tasks, suggesting that visualization design would benefit from considering context-dependent effectiveness. Based on our findings, we derive recommendations on which visualizations to choose for different tasks. We finally train a decision tree on the data we collected to drive a recommender, showcasing how to effectively engineer experimental user data into practical visualization systems.
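The final recommender step is straightforward to reproduce in spirit with scikit-learn: fit a decision tree mapping task/data features to the best-performing chart type. The features and labels below are made up; the real model is trained on the study's crowdsourced results.

```python
from sklearn.tree import DecisionTreeClassifier

# hypothetical training data: [task_id, n_points] -> best visualization type
X = [[0, 12], [0, 30], [1, 8], [1, 25], [2, 10], [2, 34]]
y = ["Bar Chart", "Line Chart", "Pie Chart", "Table",
     "Scatterplot", "Scatterplot"]

recommender = DecisionTreeClassifier(max_depth=3).fit(X, y)
print(recommender.predict([[1, 20]]))  # suggested chart for task 1, 20 points
```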

Journal ArticleDOI
Weitian Wang, Rui Li, Yi Chen, Z. Max Diekel, Yunyi Jia
TL;DR: A teaching-learning-collaboration (TLC) model is developed for a collaborative robot to learn from human demonstrations and assist its human partner in collaborative tasks; the advantages of the proposed approach are demonstrated via a set of experiments in realistic human–robot collaboration contexts.
Abstract: Collaborative robots are widely employed in strict hybrid assembly tasks in intelligent manufacturing. In this paper, we develop a teaching-learning-collaboration (TLC) model for the collaborative robot to learn from human demonstrations and assist its human partner in shared working situations. Through this approach, the human can program the robot using natural language instructions according to his/her personal working preferences. Afterward, the robot learns from human assembly demonstrations by taking advantage of the maximum entropy inverse reinforcement learning algorithm and updates its task-based knowledge using the optimal assembly strategy. In the collaboration process, the robot is able to leverage its learned knowledge to actively assist the human in the collaborative assembly task. Experimental results and analysis demonstrate that the proposed approach presents considerable robustness and applicability in human–robot collaborative tasks. Note to Practitioners—This paper is motivated by the human–robot collaborative assembly problem in the context of advanced manufacturing. Collaborative robotics makes a huge shift from the traditional robot-in-a-cage model to robots interacting with people in an open working environment. When the human works with the robot in a shared workspace, it is important to lessen human programming effort and improve human–robot collaboration efficiency once the task is updated. We develop a TLC model for the robot to learn from human demonstrations and assist its human partner in collaborative tasks. Once the task is changed, the human may program the robot via natural language instructions according to his/her personal working preferences. The robot can learn from human assembly demonstrations to update its task-based knowledge, which it can then leverage to actively assist the human in accomplishing the collaborative task. We demonstrate the advantages of the proposed approach via a set of experiments in realistic human–robot collaboration contexts.

Journal ArticleDOI
Rod Ellis
TL;DR: Task-based language teaching (TBLT) and task-supported language teaching (TSLT) are often seen as incompatible because they draw on different theories of language learning and language teaching, but this paper argues that they are not.
Abstract: Task-based language teaching (TBLT) and task-supported language teaching (TSLT) are often seen as incompatible as they draw on different theories of language learning and language teaching. The pos...

Proceedings ArticleDOI
20 May 2019
TL;DR: This work presents a method for transferring a vision-based lane-following driving policy from simulation to operation on a rural road without any real-world labels, and assesses driving performance using both open-loop regression metrics and closed-loop performance operating an autonomous vehicle on rural and urban roads.
Abstract: Simulation can be a powerful tool for understanding machine learning systems and designing methods to solve real-world problems. Training and evaluating methods purely in simulation is often "doomed to succeed" at the desired task in a simulated environment, but the resulting models are incapable of operation in the real world. Here we present and evaluate a method for transferring a vision-based lane-following driving policy from simulation to operation on a rural road without any real-world labels. Our approach leverages recent advances in image-to-image translation to achieve domain transfer while jointly learning a single-camera control policy from simulation control labels. We assess the driving performance of this method using both open-loop regression metrics and closed-loop performance operating an autonomous vehicle on rural and urban roads.

Journal ArticleDOI
TL;DR: The authors investigated the cognitive processes underlying pauses at different textual locations and various levels of revision (e.g., below word/clause) and found that when participants paused at larger textual units, they were more likely to look back in the text and engage in higher-order writing processes.
Abstract: This study investigated the cognitive processes underlying pauses at different textual locations (e.g., within/between words) and various levels of revision (e.g., below word/clause). We used stimulated recall, keystroke logging, and eye-tracking methodology in combination to examine pausing and revision behaviors. Thirty advanced Chinese L2 users of English performed a version of the IELTS Academic Writing Task 2. During the writing task, participants' keystrokes were logged and their eye movements were recorded. Immediately after the writing task, 12 participants also took part in a stimulated recall interview. The results revealed that, when participants paused at larger textual units, they were more likely to look back in the text and engage in higher-order writing processes. In contrast, during pauses at lower textual units, they tended to view areas closer to the inscription point and engage in lower-order writing processes. Prior to making a revision, participants most frequently had viewed the text that they subsequently revised, or their eye gazes had been off-screen. Revisions focused more on language- than content-related issues, but there was a smaller difference in the number of language- and content-focused stimulated recall comments when larger textual units were revised.

Journal ArticleDOI
TL;DR: In this article, a cognitive task analysis (CTA) of the differentiation skill was performed and the resulting differentiation skill hierarchy was presented together with the knowledge required for differentiation, and the factors influencing its complexity.
Abstract: Providing differentiated instruction (DI) is considered an important but complex teaching skill which many teachers have not mastered and feel unprepared for. In order to design professional development activities, a thorough description of DI is required. The international literature on assessing teachers’ differentiation qualities describes the use of various instruments, ranging from self-reports to observation schemes and from perceived-difficulty instruments to student questionnaires. We question whether these instruments truly capture the complexity of differentiation. In order to depict this complexity, a cognitive task analysis (CTA) of the differentiation skill was performed. The resulting differentiation skill hierarchy is presented here, together with the knowledge required for differentiation, and the factors influencing its complexity. Based on the insights of this CTA, professional development trajectories can be designed and a comprehensive assessment instrument can be developed, en...

Journal ArticleDOI
TL;DR: A novel deep neural network-based multi-task learning approach for NR-IQA that improves representation and generalization ability, achieving improvements of up to 21.8% on unseen distortion types.
Abstract: No-reference image quality assessment (NR-IQA) is a non-trivial task, because it is hard to find a pristine counterpart for an image in real applications, such as image selection and high-quality image recommendation. In recent years, deep learning-based NR-IQA methods have emerged and achieved better performance than previous methods. In this paper, we present a novel deep neural network-based multi-task learning approach for NR-IQA. Our proposed network is designed in a multi-task learning manner and consists of two tasks, namely, the natural scene statistics (NSS) feature prediction task and the quality score prediction task. NSS feature prediction is an auxiliary task, which helps the quality score prediction task learn a better mapping between the input image and its quality score. The main contribution of this work is to integrate the NSS feature prediction task with the deep learning-based image quality prediction task to improve the representation ability and generalization ability; to the best of our knowledge, it is the first such attempt. We conduct same-database validation and cross-database validation experiments on the LIVE, TID2013, CSIQ, LIVE multiply distorted image quality (LIVE MD), CID2013, and LIVE in the wild image quality challenge (LIVE challenge) databases to verify the superiority and generalization ability of the proposed method. Experimental results confirm the superior performance of our method on same-database validation; in particular, our method achieves 0.984 and 0.986 on the LIVE image quality assessment database in terms of the Pearson linear correlation coefficient (PLCC) and Spearman rank-order correlation coefficient (SROCC), respectively. Experimental results from cross-database validation also verify the strong generalization ability of our method; specifically, our method gains improvements of up to 21.8% on unseen distortion types.
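A bare-bones version of the two-head design looks like the following: a shared trunk feeds one head that regresses the auxiliary NSS features and another that regresses the quality score, trained with a summed loss. The trunk, the 36-dimensional NSS vector, and the equal loss weighting are placeholders, not the paper's exact network.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiTaskIQA(nn.Module):
    def __init__(self, nss_dim=36):
        super().__init__()
        self.trunk = nn.Sequential(  # shared feature extractor (toy)
            nn.Conv2d(3, 16, 3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.nss_head = nn.Linear(16, nss_dim)  # auxiliary NSS prediction
        self.score_head = nn.Linear(16, 1)      # quality score prediction

    def forward(self, x):
        h = self.trunk(x)
        return self.nss_head(h), self.score_head(h)

model = MultiTaskIQA()
nss_pred, score_pred = model(torch.randn(4, 3, 224, 224))
loss = F.mse_loss(nss_pred, torch.randn(4, 36)) \
     + F.mse_loss(score_pred, torch.randn(4, 1))  # joint multi-task loss
loss.backward()
```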

Journal ArticleDOI
21 May 2019
TL;DR: This work proposes a hybrid feedback control strategy using time-varying control barrier functions that finds least violating solutions in the aforementioned conflicting situations based on a suitable robustness notion and by initiating collaboration among agents.
Abstract: Motivated by the recent interest in cyber-physical and interconnected autonomous systems, we study the problem of dynamically coupled multi-agent systems under conflicting local signal temporal logic (STL) tasks. Each agent is assigned a local STL task regardless of the tasks that the other agents are assigned to. Such a task may be dependent, i.e., the satisfaction of the task may depend on the behavior of more than one agent, so that the satisfaction of the conjunction of all local tasks may be conflicting. We propose a hybrid feedback control strategy using time-varying control barrier functions. Our control strategy finds least violating solutions in the aforementioned conflicting situations based on a suitable robustness notion and by initiating collaboration among agents.
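As background for the control strategy, the snippet below shows a control barrier function filter in its simplest one-dimensional form: a nominal control is clipped so that the barrier h(x) = x_max - x satisfies h_dot >= -alpha*h, keeping the state inside the safe set. The single-integrator dynamics and scalar barrier are a toy illustration; the paper's time-varying, multi-agent STL setting is far more general.

```python
def cbf_filter(x, u_nom, x_max=1.0, alpha=2.0):
    """Project a nominal control so that h(x) = x_max - x stays nonnegative.
    With dynamics x_dot = u we have h_dot = -u, so the CBF condition
    h_dot >= -alpha*h reduces to u <= alpha*h."""
    return min(u_nom, alpha * (x_max - x))

x, dt = 0.0, 0.01
for _ in range(500):
    u = cbf_filter(x, u_nom=3.0)  # nominal control pushes hard toward x_max
    x += u * dt
print(round(x, 5))  # approaches, but never exceeds, x_max = 1.0
```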

Proceedings ArticleDOI
20 May 2019
TL;DR: In this paper, the authors show that relatively minor modifications to an off-the-shelf Deep-RL algorithm, combined with a small number of human demonstrations, allows the robot to quickly learn to solve these tasks efficiently and robustly.
Abstract: Insertion is a challenging haptic and visual control problem with significant practical value for manufacturing. Existing approaches in the model-based robotics community can be highly effective when task geometry is known, but are complex and cumbersome to implement, and must be tailored to each individual problem by a qualified engineer. Within the learning community there is a long history of insertion research, but existing approaches are either too sample-inefficient to run on real robots, or assume access to high-level object features, e.g. socket pose. In this paper we show that relatively minor modifications to an off-the-shelf Deep-RL algorithm (DDPG), combined with a small number of human demonstrations, allows the robot to quickly learn to solve these tasks efficiently and robustly. Our approach requires no modeling or simulation, no parameterized search or alignment behaviors, no vision system aside from raw images, and no reward shaping. We evaluate our approach on a narrow-clearance peg-insertion task and a deformable clip-insertion task, both of which include variability in the socket position. Our results show that these tasks can be solved reliably on the real robot in less than 10 minutes of interaction time, and that the resulting policies are robust to variance in the socket position and orientation.

Journal ArticleDOI
TL;DR: This paper formulates the task assignment and path planning problem in mobile crowdsensing mathematically when all task and user arrival information is known a priori, proves it to be NP-hard, and designs four online task assignment algorithms: a quality/progress-based algorithm (QPA), a task-density-based algorithm (TDA), a travel-distance-balance-based algorithm (DBA), and a bio-inspired travel-distance-balance-based algorithm (B-DBA).
Abstract: Mobile crowdsensing has been a promising and cost-effective sensing paradigm for the Internet of Things, exploiting the proliferation and prevalence of sensor-rich mobile devices held or worn by mobile users. In this paper, we study the task assignment and path planning problem in mobile crowdsensing, which aims to maximize total task quality under constraints on users' travel distance budgets. We first formulate the problem mathematically when all task and user arrival information is known a priori and prove it to be NP-hard. Then, we focus on scenarios where users and tasks arrive dynamically and accordingly design four online task assignment algorithms: a quality/progress-based algorithm (QPA), a task-density-based algorithm (TDA), a travel-distance-balance-based algorithm (DBA), and a bio-inspired travel-distance-balance-based algorithm (B-DBA). All four algorithms assign tasks online upon the arrival of a new user. The first three work in a greedy manner, assigning one task at a time: the QPA prefers the task with the largest ratio of task quality increment to travel cost, the TDA tends to guide the user to high-task-density areas, and the DBA further considers travel distance balance information. The last algorithm, B-DBA, integrates the travel-distance-balance-aware metric of the DBA with bio-inspired search for further improved task assignment performance. The complexities of the proposed algorithms are derived. Simulation results validate the effectiveness of our algorithms; B-DBA has the best performance among the four in terms of task quality and, furthermore, outperforms existing work in this area.
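The QPA's greedy rule is easy to sketch: when a user arrives, repeatedly pick the task with the largest ratio of quality gain to travel cost until the distance budget is exhausted. The quality and cost models below are placeholders, and the real algorithm's details (including the path-planning aspect) are simplified away.

```python
import math

def qpa_assign(user_pos, budget, tasks):
    """tasks: {task_id: (x, y, quality_gain)}; returns assigned task ids."""
    assigned, pos, remaining = [], user_pos, budget
    pool = dict(tasks)
    while pool:
        def ratio(tid):
            x, y, gain = pool[tid]
            return gain / (math.dist(pos, (x, y)) + 1e-9)
        best = max(pool, key=ratio)          # largest quality-gain/cost ratio
        x, y, gain = pool.pop(best)
        cost = math.dist(pos, (x, y))
        if cost > remaining:
            break
        assigned.append(best)
        pos, remaining = (x, y), remaining - cost
    return assigned

tasks = {"t1": (1, 0, 5.0), "t2": (2, 2, 9.0), "t3": (0, 3, 4.0)}
print(qpa_assign(user_pos=(0, 0), budget=5.0, tasks=tasks))  # ['t1', 't2']
```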

Journal ArticleDOI
TL;DR: This study introduces a method to design a curriculum for machine-learning to maximize the efficiency during the training process of deep neural networks (DNNs) for speech emotion recognition and proposes metrics that quantify the inter-evaluation agreement to define the curriculum for regression problems and binary and multi-class classification problems.
Abstract: This study introduces a method to design a curriculum that maximizes the efficiency of training deep neural networks (DNNs) for speech emotion recognition. Previous studies on other machine-learning problems have shown the benefits of training a classifier following a curriculum in which samples are gradually presented in increasing order of difficulty. For speech emotion recognition, the challenge is to establish a natural order of difficulty in the training set to create the curriculum. We address this problem by assuming that samples that are ambiguous for humans are also ambiguous for computers. Speech samples are often annotated by multiple evaluators to account for differences in emotion perception across individuals. While some sentences with clear emotional content are consistently annotated, sentences with more ambiguous emotional content show substantial disagreement between individual evaluations. We propose to use the disagreement between evaluators as a measure of difficulty for the classification task, and we propose metrics that quantify inter-evaluator agreement to define curricula for regression problems and for binary and multi-class classification problems. The experimental results consistently show that relying on a curriculum based on agreement between human judgments leads to statistically significant improvements over baselines trained without a curriculum.
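Turning evaluator disagreement into a curriculum can be as simple as ordering samples by the entropy of their annotation distribution, easy (high agreement) first. Label entropy is one reasonable disagreement metric for the classification case; the paper proposes its own agreement metrics, and the data below is invented.

```python
import numpy as np

def label_entropy(votes):
    """Entropy of the annotators' label distribution (0 = full agreement)."""
    _, counts = np.unique(votes, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log(p)).sum())

annotations = {  # per-sample emotion labels from five hypothetical raters
    "s1": ["happy"] * 5,                                      # easy
    "s2": ["happy", "happy", "neutral", "happy", "neutral"],  # medium
    "s3": ["angry", "sad", "neutral", "happy", "sad"],        # hard
}
curriculum = sorted(annotations, key=lambda s: label_entropy(annotations[s]))
print(curriculum)  # easy-to-hard presentation order: ['s1', 's2', 's3']
```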

Proceedings ArticleDOI
17 Mar 2019
TL;DR: This paper proposes human-guided machine learning (HGML) as a hybrid approach where a user interacts with an AutoML system and tasks it to explore different problem settings that reflect the user's knowledge about the data available.
Abstract: Automated Machine Learning (AutoML) systems are emerging that automatically search for possible solutions from a large space of possible kinds of models. Although fully automated machine learning is appropriate for many applications, users often have knowledge that supplements and constraints the available data and solutions. This paper proposes human-guided machine learning (HGML) as a hybrid approach where a user interacts with an AutoML system and tasks it to explore different problem settings that reflect the user's knowledge about the data available. We present: 1) a task analysis of HGML that shows the tasks that a user would want to carry out, 2) a characterization of two scientific publications, one in neuroscience and one in political science, in terms of how the authors would search for solutions using an AutoML system, 3) requirements for HGML based on those characterizations, and 4) an assessment of existing AutoML systems in terms of those requirements.