
Showing papers on "Task (computing) published in 2014"


Proceedings ArticleDOI
17 Aug 2014
TL;DR: An adaptive measurement framework, called DREAM, that dynamically adjusts the resources devoted to each measurement task, while ensuring a user-specified level of accuracy, is described.
Abstract: Software-defined networks can enable a variety of concurrent, dynamically instantiated measurement tasks that provide fine-grain visibility into network traffic. Recently, there have been many proposals to configure TCAM counters in hardware switches to monitor traffic. However, the TCAM memory at switches is fundamentally limited, and the accuracy of the measurement tasks is a function of the resources devoted to them on each switch. This paper describes an adaptive measurement framework, called DREAM, that dynamically adjusts the resources devoted to each measurement task, while ensuring a user-specified level of accuracy. Since the trade-off between resource usage and accuracy can depend upon the type of tasks, their parameters, and traffic characteristics, DREAM does not assume an a priori characterization of this trade-off, but instead dynamically searches for a resource allocation that is sufficient to achieve a desired level of accuracy. A prototype implementation and simulations with three network-wide measurement tasks (heavy hitter, hierarchical heavy hitter and change detection) and diverse traffic show that DREAM can support more concurrent tasks with higher accuracy than several other alternatives.
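The adaptive search that DREAM performs is not spelled out in the abstract; a toy sketch of the general idea, with an assumed `est_accuracy` callback and a simple additive-increase rule (both illustrative, not the paper's algorithm), might look like:

```python
def adapt_allocation(tasks, total_tcam, step=8):
    """Illustrative sketch of DREAM-style adaptive allocation (NOT the
    paper's actual algorithm): each task gets more TCAM counters while
    its estimated accuracy is below its user-specified target, subject
    to a switch-wide TCAM budget."""
    alloc = {t["name"]: step for t in tasks}  # start every task small
    for _ in range(100):  # iterate until allocations stabilize
        changed = False
        for t in tasks:
            name = t["name"]
            # est_accuracy is an assumed callback: accuracy estimated
            # as a function of counters devoted to the task
            if (t["est_accuracy"](alloc[name]) < t["target"]
                    and sum(alloc.values()) + step <= total_tcam):
                alloc[name] += step
                changed = True
        if not changed:
            break
    return alloc
```

With a hypothetical accuracy curve `min(1, counters/64)`, a task targeting 0.9 accuracy settles at 64 counters while one targeting 0.5 settles at 32, leaving the rest of the budget free for other tasks.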

166 citations


Proceedings ArticleDOI
03 Jul 2014
TL;DR: This study characterizes and compares user behavior in relatively long search sessions for search tasks of four different types, noting that users shift their interest to focus less on the top results and more on results ranked at lower positions in browsing, and that results eventually become less and less attractive to the users.
Abstract: There are many existing studies of user behavior in simple tasks (e.g., navigational and informational search) within a short duration of 1--2 queries. However, we know relatively little about user behavior, especially browsing and clicking behavior, in longer search sessions solving complex search tasks. In this paper, we characterize and compare user behavior in relatively long search sessions (10 minutes; about 5 queries) for search tasks of four different types. The tasks differ in two dimensions: (1) whether the user is locating facts or pursuing intellectual understanding of a topic; (2) whether the user has a specific task goal or an ill-defined and undeveloped goal. We analyze how search behavior as well as browsing and clicking patterns change during a search session in these different tasks. Our results indicate that user behavior in the four types of tasks differs in various aspects, including search activeness, browsing style, clicking strategy, and query reformulation. As a search session progresses, we note that users shift their interest to focus less on the top results and more on results ranked at lower positions in browsing. We also find that results eventually become less and less attractive to the users. The reasons vary and include degraded search performance of queries, decreased novelty of search results, and decaying persistence of users in browsing. Our study highlights the lack of long-session support in existing search engines and suggests different strategies for supporting longer sessions according to task type.

123 citations


Posted Content
TL;DR: In this article, the authors propose curriculum learning of tasks, i.e., finding the best order of tasks to be learned, based on a generalization bound criterion for choosing the task order that optimizes the average expected classification performance over all tasks.
Abstract: Sharing information between multiple tasks enables algorithms to achieve good generalization performance even from small amounts of training data. However, in a realistic scenario of multi-task learning not all tasks are equally related to each other, hence it could be advantageous to transfer information only between the most related tasks. In this work we propose an approach that processes multiple tasks in a sequence with sharing between subsequent tasks instead of solving all tasks jointly. Subsequently, we address the question of curriculum learning of tasks, i.e. finding the best order of tasks to be learned. Our approach is based on a generalization bound criterion for choosing the task order that optimizes the average expected classification performance over all tasks. Our experimental results show that learning multiple related tasks sequentially can be more effective than learning them jointly, the order in which tasks are being solved affects the overall performance, and that our model is able to automatically discover the favourable order of tasks.
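For a small number of tasks, the idea of searching for a favourable task order can be illustrated with a brute-force sketch (the `seq_score` oracle stands in for the paper's generalization-bound criterion and is an assumption; exhaustive search is feasible only for few tasks):

```python
from itertools import permutations

def best_task_order(tasks, seq_score):
    """Toy curriculum search over task orders (assumed interface, not
    the paper's method): seq_score(order) returns the average expected
    performance when tasks are learned in that sequence; we pick the
    order that maximizes it by exhaustive enumeration."""
    return max(permutations(tasks), key=seq_score)
```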

119 citations


Patent
30 Dec 2014
TL;DR: In this article, the authors describe a technique for discovering capabilities of voice-enabled resources using a digital personal assistant that can respond to user requests to list available voice-enabled resources that are capable of performing a specific task using voice input.
Abstract: Techniques are described for discovering capabilities of voice-enabled resources. A voice-controlled digital personal assistant can respond to user requests to list available voice-enabled resources that are capable of performing a specific task using voice input. The voice-controlled digital personal assistant can also respond to user requests to list the tasks that a particular voice-enabled resource can perform using voice input. The voice-controlled digital personal assistant can also support a practice mode in which users practice voice commands for performing tasks supported by voice-enabled resources.

100 citations


Journal ArticleDOI
TL;DR: A self-organized method for allocating the individuals of a robot swarm to tasks that are sequentially interdependent, which allows a swarm to reach a near-optimal allocation in the studied environments, can be transferred to a real robot setting, and is adaptive to changes in the properties of the tasks such as their duration.
Abstract: In this article we present a self-organized method for allocating the individuals of a robot swarm to tasks that are sequentially interdependent. Tasks that are sequentially interdependent are common in natural and artificial systems. The proposed method relies neither on global knowledge nor on centralized components. Moreover, it does not require the robots to communicate. The method is based on the delay experienced by the robots working on one subtask when waiting for input from another subtask. We explore the capabilities of the method in different simulated environments. Additionally, we evaluate the method in a proof-of-concept experiment using real robots. We show that the method allows a swarm to reach a near-optimal allocation in the studied environments, can easily be transferred to a real robot setting, and is adaptive to changes in the properties of the tasks such as their duration. Finally, we show that the ideal setting of the parameters of the method does not depend on the properties of the environment.

94 citations


Journal ArticleDOI
TL;DR: A new MIP model is presented, a novel heuristic algorithm based on beam search is proposed, as well as a task-oriented branch-and-bound procedure which uses new reduction rules and lower bounds for solving the ALWABP problem.

91 citations


Proceedings ArticleDOI
26 Apr 2014
TL;DR: This paper combines prior results from perceptual science and graphical perception to suggest a set of design variables that influence performance on various aggregate comparison tasks, and describes how choices in these variables can lead to designs that are matched to particular tasks.
Abstract: Many visualization tasks require the viewer to make judgments about aggregate properties of data. Recent work has shown that viewers can perform such tasks effectively, for example to efficiently compare the maximums or means over ranges of data. However, this work also shows that such effectiveness depends on the designs of the displays. In this paper, we explore this relationship between aggregation task and visualization design to provide guidance on matching tasks with designs. We combine prior results from perceptual science and graphical perception to suggest a set of design variables that influence performance on various aggregate comparison tasks. We describe how choices in these variables can lead to designs that are matched to particular tasks. We use these variables to assess a set of eight different designs, predicting how they will support a set of six aggregate time series comparison tasks. A crowd-sourced evaluation confirms these predictions. These results not only provide evidence for how the specific visualizations support various tasks, but also suggest using the identified design variables as a tool for designing visualizations well suited for various types of tasks.

89 citations


Proceedings Article
08 Dec 2014
TL;DR: This paper develops two multi-task extensions of the fitted Q-iteration algorithm that assume that the tasks are jointly sparse in the given representation and learns a transformation of the features in the attempt of finding a more sparse representation.
Abstract: In multi-task reinforcement learning (MTRL), the objective is to simultaneously learn multiple tasks and exploit their similarity to improve the performance w.r.t. single-task learning. In this paper we investigate the case when all the tasks can be accurately represented in a linear approximation space using the same small subset of the original (large) set of features. This is equivalent to assuming that the weight vectors of the task value functions are jointly sparse, i.e., the set of their non-zero components is small and it is shared across tasks. Building on existing results in multi-task regression, we develop two multi-task extensions of the fitted Q-iteration algorithm. While the first algorithm assumes that the tasks are jointly sparse in the given representation, the second one learns a transformation of the features in an attempt to find a sparser representation. For both algorithms we provide a sample complexity analysis and numerical simulations.
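The joint-sparsity assumption corresponds to an l2,1-style penalty on the task weight matrix; a minimal sketch of the associated group soft-thresholding step (illustrative of the shared-support idea, not the paper's fitted Q-iteration code):

```python
import math

def group_soft_threshold(W, lam):
    """Group soft-thresholding: W is a list of rows, where row i holds
    feature i's weights across all tasks. Shrinking each row by its l2
    norm zeroes features that are weak in *every* task, so the non-zero
    support is shared across tasks (joint sparsity)."""
    out = []
    for row in W:
        norm = math.sqrt(sum(w * w for w in row))
        scale = max(0.0, 1.0 - lam / norm) if norm > 0 else 0.0
        out.append([w * scale for w in row])
    return out
```

A strong feature row like `[3, 4]` (norm 5) is merely shrunk, while a uniformly weak row like `[0.1, 0.1]` is zeroed for both tasks at once.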

73 citations


Proceedings ArticleDOI
19 May 2014
TL;DR: The goal is to assign tasks to cores so that interdependent tasks are performed by "nearby" cores, thus lowering the distance messages must travel, the amount of congestion in the network, and the overall cost of communication.
Abstract: We present a new method for mapping applications' MPI tasks to cores of a parallel computer such that communication and execution time are reduced. We consider the case of sparse node allocation within a parallel machine, where the nodes assigned to a job are not necessarily located within a contiguous block nor within close proximity to each other in the network. The goal is to assign tasks to cores so that interdependent tasks are performed by "nearby" cores, thus lowering the distance messages must travel, the amount of congestion in the network, and the overall cost of communication. Our new method applies a geometric partitioning algorithm to both the tasks and the processors, and assigns task parts to the corresponding processor parts. We show that, for the structured finite difference mini-app MiniGhost, our mapping method reduced execution time by 34% on average on 65,536 cores of a Cray XE6. In a molecular dynamics mini-app, MiniMD, our mapping method reduced communication time by 26% on average on 6144 cores. We also compare our mapping with graph-based mappings from the LibTopoMap library and show that our mappings reduced the communication time on average by 15% in MiniGhost and 10% in MiniMD.
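The geometric idea can be illustrated with a small recursive-bisection sketch (2-D coordinates and equal task/core counts are assumed here; the paper's actual partitioner is more general):

```python
def geometric_map(tasks, cores):
    """Toy recursive coordinate bisection: split tasks and cores along
    the axis of largest task spread, then pair corresponding halves,
    so geometrically nearby tasks land on geometrically nearby cores.
    tasks and cores are equal-length lists of (x, y) tuples."""
    if len(tasks) == 1:
        return {tasks[0]: cores[0]}
    # pick the axis with the largest task spread
    axis = max((0, 1),
               key=lambda a: max(t[a] for t in tasks) - min(t[a] for t in tasks))
    ts = sorted(tasks, key=lambda t: t[axis])
    cs = sorted(cores, key=lambda c: c[axis])
    mid = len(ts) // 2
    mapping = geometric_map(ts[:mid], cs[:mid])
    mapping.update(geometric_map(ts[mid:], cs[mid:]))
    return mapping
```

Two clusters of tasks at opposite ends of the x axis end up assigned to the cores at the matching ends, which is the locality property the mapping is after.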

68 citations


Proceedings ArticleDOI
09 Jun 2014
TL;DR: Varuna as mentioned in this paper is a system that dynamically, continuously, rapidly and transparently adapts a program's parallelism to best match the instantaneous capabilities of the hardware resources while satisfying different efficiency metrics.
Abstract: Future multicore processors will be heterogeneous, be increasingly less reliable, and operate in dynamically changing operating conditions. Such environments will result in a constantly varying pool of hardware resources which can greatly complicate the task of efficiently exposing a program's parallelism onto these resources. Coupled with this uncertainty is the diverse set of efficiency metrics that users may desire. This paper proposes Varuna, a system that dynamically, continuously, rapidly and transparently adapts a program's parallelism to best match the instantaneous capabilities of the hardware resources while satisfying different efficiency metrics. Varuna is applicable to both multithreaded and task-based programs and can be seamlessly inserted between the program and the operating system without needing to change the source code of either. We demonstrate Varuna's effectiveness in diverse execution environments using unaltered C/C++ parallel programs from various benchmark suites. Regardless of the execution environment, Varuna always outperformed the state-of-the-art approaches for the efficiency metrics considered.

62 citations


Patent
31 Dec 2014
TL;DR: In this paper, a vehicle interface system includes a graphics processing unit and a plurality of processing domains, and a task scheduler configured to receive the tasks generated by the processing domains and to determine an order in which to send the tasks to the GPU.
Abstract: A vehicle interface system includes a graphics processing unit and a plurality of processing domains. The processing domains execute vehicle applications and generate tasks for the graphics processing unit. The system further includes a task scheduler configured to receive the tasks generated by the processing domains and to determine an order in which to send the tasks to the graphics processing unit. The graphics processing unit processes the tasks in the order determined by the task scheduler and generates display data based on the tasks. The system further includes an electronic display configured to receive the display data generated by the graphics processing unit and to present the display data to a user.

Proceedings ArticleDOI
29 Sep 2014
TL;DR: It is shown that, compared to the state-of-the-art solution, the proposed schedulability test derived from the refined DBFs can accommodate smaller periods and thus achieve better service guarantees for low-critical tasks.
Abstract: Most mixed-criticality scheduling algorithms have the problem of service interruption for low-critical tasks, which has prompted several recent studies on providing various service guarantees for such tasks. In this paper, focusing on dual-criticality systems, we explore the best achievable service guarantees for low-critical tasks in different running modes and investigate their trade-offs. Specifically, the Elastic Mixed-Criticality (E-MC) task model is first extended to allow each low-critical task to have a pair of small and large periods, which represent its service guarantees in the low and high running modes, respectively. To improve system schedulability under a mode-switch EDF scheduler, virtual deadlines for high-critical tasks are also incorporated. Then, we develop new demand bound functions (DBFs) following a unified approach and analyze the corresponding schedulability conditions. The service guarantees for low-critical tasks are explored via the adjustment of their paired periods. We show that, compared to the state-of-the-art solution, the proposed schedulability test derived from the refined DBFs can accommodate smaller periods and thus achieve better service guarantees for low-critical tasks. Moreover, there are interesting trade-offs among the service guarantees, and a few guidelines are provided for properly specifying them.
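The classic demand bound function for sporadic tasks under EDF, which refined DBFs of this kind build on, can be sketched as follows (a standard textbook form, not the paper's refined test; the finite-horizon loop is a simplification of the analytically bounded test):

```python
def dbf(task, t):
    """Classic sporadic-task demand bound function: worst-case execution
    demand of all jobs with both release and deadline inside any
    interval of length t. task = (C, D, T) = (WCET, deadline, period)."""
    C, D, T = task
    return max(0, (t - D) // T + 1) * C

def edf_schedulable(tasks, horizon):
    """EDF schedulability check over a finite horizon: total demand must
    never exceed the interval length at any point up to the horizon."""
    return all(sum(dbf(tk, t) for tk in tasks) <= t
               for t in range(1, horizon + 1))
```

Shrinking a task's period T inflates its demand curve, which is why the paper's refined DBFs are needed to certify the smaller low-critical periods.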

Proceedings ArticleDOI
19 May 2014
TL;DR: The analysis highlights that these generic task-based runtimes achieve comparable results to the application-optimized embedded scheduler on homogeneous platforms and are able to significantly speed up the solver on heterogeneous environments by taking advantage of the accelerators while hiding the complexity of their efficient manipulation from the programmer.
Abstract: The ongoing hardware evolution exhibits an escalation in the number, as well as in the heterogeneity, of computing resources. The pressure to maintain reasonable levels of performance and portability forces application developers to leave the traditional programming paradigms and explore alternative solutions. PaStiX is a parallel sparse direct solver, based on a dynamic scheduler for modern hierarchical manycore architectures. In this paper, we study the benefits and limits of replacing the highly specialized internal scheduler of the PaStiX solver with two generic runtime systems: PaRSEC and StarPU. The task graph of the factorization step is made available to the two runtimes, providing them the opportunity to process and optimize its traversal in order to maximize the algorithm efficiency for the targeted hardware platform. A comparative study of the performance of the PaStiX solver on top of its native internal scheduler, PaRSEC, and StarPU frameworks, on different execution environments, is performed. The analysis highlights that these generic task-based runtimes achieve comparable results to the application-optimized embedded scheduler on homogeneous platforms. Furthermore, they are able to significantly speed up the solver on heterogeneous environments by taking advantage of the accelerators while hiding the complexity of their efficient manipulation from the programmer.

Proceedings ArticleDOI
15 Feb 2014
TL;DR: The contribution is to provide a compiler methodology to automatically generate the access-phases for a task-based programming system and shows that the automatically generated versions improve EDP by 25% on average compared to a coupled execution, without any performance degradation.
Abstract: Traditional compiler approaches to optimize power efficiency aim to adjust voltage and frequency at runtime to match the code characteristics to the hardware (e.g., running memory-bound phases at a lower frequency). However, such approaches are constrained by three factors: (i) voltage-frequency transitions are too slow to be applied at instruction granularity, (ii) larger code regions are seldom unequivocally memory- or compute-bound, and, (iii) the available voltage scaling range for future technologies is rapidly shrinking. These factors necessitate new approaches to address power-efficiency at the code-generation level. This paper proposes one such approach to automatically generate power-efficient code using a decoupled access/execute (DAE) model.In DAE a program is split into tasks, where each task consists of two sufficiently coarse-grained phases to enable effective Dynamic Voltage Frequency Scaling (DVFS): (i) the access-phase for data prefetch (heavily memory-bound), and (ii) the execute-phase that performs the actual computation (heavily compute-bound). Our contribution is to provide a compiler methodology to automatically generate the access-phases for a task-based programming system. Our approach is capable of handling both affine (through a polyhedral analysis) and non-affine codes (through optimized task skeletons). Our evaluation shows that the automatically generated versions improve EDP by 25% on average compared to a coupled execution, without any performance degradation, and surpasses the EDP savings of the corresponding hand-crafted tasks by 5%.
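The access/execute split can be illustrated with a toy sketch (the DVFS settings themselves are hardware/OS-level and appear only as comments; the phase decomposition, not the frequency control, is what is shown):

```python
def access_phase(data, indices):
    """Access phase (memory-bound): gather the needed data so it is
    resident in cache before computation begins. In a DAE system this
    phase would run at a low voltage/frequency setting."""
    return [data[i] for i in indices]  # the prefetched working set

def execute_phase(working_set):
    """Execute phase (compute-bound): the actual computation over the
    already-prefetched data. In a DAE system this phase would run at a
    high voltage/frequency setting, with no memory stalls to hide."""
    return sum(x * x for x in working_set)
```

Because each phase is coarse-grained and unambiguously memory- or compute-bound, a slow voltage-frequency transition between the two phases pays off, which is exactly the property the paper's compiler-generated access-phases are designed to create.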

Proceedings ArticleDOI
08 Oct 2014
TL;DR: This paper proposes a distributed run-time WCET controller that works as follows: locally, each critical task regularly checks whether the interference due to the low-criticality tasks can be tolerated, and otherwise decides their suspension; globally, a master suspends and restarts the low-criticality tasks based on the requests received from the critical tasks.
Abstract: When integrating mixed-critical systems on a multi-/many-core, one challenge is to ensure predictability for high-criticality tasks and increased utilization for low-criticality tasks. In this paper, we address this problem when several high-criticality tasks with different deadlines, periods, and offsets are concurrently executed on the system. We propose a distributed run-time WCET controller that works as follows: (1) locally, each critical task regularly checks whether the interference due to the low-criticality tasks can be tolerated, and otherwise decides their suspension; (2) globally, a master suspends and restarts the low-criticality tasks based on the requests received from the critical tasks. Our approach has been implemented as a software controller on a real multi-core COTS system with significant gains.

Journal ArticleDOI
14 Jun 2014
TL;DR: This work proposes and evaluates the first channel implementation, and presents a case study that maps the fine-grain, recursive task spawning in the Cilk programming language to channels by representing it as a flow graph, and proposes a hardware mechanism that allows wavefronts to yield their execution resources.
Abstract: In general-purpose graphics processing unit (GPGPU) computing, data is processed by concurrent threads executing the same function. This model, dubbed single-instruction/multiple-thread (SIMT), requires programmers to coordinate the synchronous execution of similar operations across thousands of data elements. To alleviate this programmer burden, Gaster and Howes outlined the channel abstraction, which facilitates dynamically aggregating asynchronously produced fine-grain work into coarser-grain tasks. However, no practical implementation has been proposed. To this end, we propose and evaluate the first channel implementation. To demonstrate the utility of channels, we present a case study that maps the fine-grain, recursive task spawning in the Cilk programming language to channels by representing it as a flow graph. To support data-parallel recursion in bounded memory, we propose a hardware mechanism that allows wavefronts to yield their execution resources. Through channels and wavefront yield, we implement four Cilk benchmarks. We show that Cilk can scale with the GPU architecture, achieving speedups of as much as 4.3x on eight compute units.

Proceedings ArticleDOI
01 Dec 2014
TL;DR: Going beyond prior work on linear delay constraints that apply only to serial tasks, this work generalizes the delay constraints to settings where the dependency between tasks can be described by a tree and provides an algorithm, DTP (Deterministic delay constrained Task Partitioning), to solve the offloading decision problem with delay constraints.
Abstract: Computation offloading, sending computational tasks to more resourceful servers, is becoming a widely used approach to conserve limited resources on mobile devices, such as battery life, storage, and processing power. Given an application that is partitioned into multiple tasks, an offloading decision can be made for each of them. However, considering the delay constraint and the extra costs of data transmission and remote computation, it is not trivial to make optimized decisions. Existing works have formulated offloading decision problems as either graph-partitioning or binary integer programming problems. The first approach can solve the problem in polynomial time but is not applicable to delay constraints. The second approach relies on an integer programming solver without a polynomial-time guarantee. We provide an algorithm, DTP (Deterministic delay constrained Task Partitioning), to solve the offloading decision problem with delay constraints. DTP gives near-optimal solutions and runs in polynomial time in the number of tasks. Going beyond prior work on linear delay constraints that apply only to serial tasks, we generalize the delay constraints to settings where the dependency between tasks can be described by a tree. Furthermore, we provide another algorithm, PTP (Probabilistic delay constrained Task Partitioning), which gives stronger QoS guarantees. Simulation results show that our algorithms are accurate and robust, and scale well with the number of tasks.
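A dynamic program over a task tree conveys the flavor of tree-structured offloading decisions (a toy model, not the paper's DTP: per-task local/remote costs and a fixed transfer cost per cut edge, with the paper's delay constraint omitted for brevity):

```python
def offload_tree(children, local, remote, xfer, root=0):
    """Toy offloading DP on a task tree: each task runs locally or
    remotely; an edge whose endpoints run in different places incurs
    transfer cost xfer. Returns the minimum total cost.
    children maps a task id to its child task ids; local[i]/remote[i]
    are task i's execution costs in each placement."""
    def best(v):
        # cost[p] = min cost of v's subtree with v placed at p
        # (p = 0 means local, p = 1 means remote)
        cost = [local[v], remote[v]]
        for c in children.get(v, []):
            cl, cr = best(c)
            cost[0] += min(cl, cr + xfer)  # child may stay or cross the edge
            cost[1] += min(cr, cl + xfer)
        return cost
    return min(best(root))
```

On a two-task chain where each task is cheap in a different place, the DP correctly weighs the transfer cost of splitting the chain against keeping both tasks together.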

Proceedings ArticleDOI
03 Nov 2014
TL;DR: This work proposes a Multi-tAsk MUlti-view Discriminant Analysis (MAMUDA) method that collaboratively learns the feature transformations for different views in different tasks by exploring the shared task-specific and problem intrinsic structures.
Abstract: Multi-task multi-view learning deals with the learning scenarios where multiple tasks are associated with each other through multiple shared feature views. All previous works for this problem assume that the tasks use the same set of class labels. However, in real world there exist quite a few applications where the tasks with several views correspond to different set of class labels. This new learning scenario is called Multi-task Multi-view Learning for Heterogeneous Tasks in this study. Then, we propose a Multi-tAsk MUlti-view Discriminant Analysis (MAMUDA) method to solve this problem. Specifically, this method collaboratively learns the feature transformations for different views in different tasks by exploring the shared task-specific and problem intrinsic structures. Additionally, MAMUDA method is convenient to solve the multi-class classification problems. Finally, the experiments on two real-world problems demonstrate the effectiveness of MAMUDA for heterogeneous tasks.

Patent
07 May 2014
TL;DR: In this article, a distributed asynchronous task queue execution in a cloud environment is described, which consists of an interface device, a task queue device and a plurality of task execution devices.
Abstract: The invention provides a system and a method for distributed asynchronous task queue execution in a cloud environment. The system comprises an interface device, a task queue device, and a plurality of task execution devices. The interface device is used for receiving a task execution request submitted by a user, verifying whether the quota of the user is overrun, and, when it is not, receiving the tasks submitted by the user and writing their basic information into a database. The task queue device is used for judging whether the tasks submitted by the user need to be executed in parallel or serially, and correspondingly placing them in a parallel task queue or a serial task queue. The task execution devices are used for executing the parallel and serial tasks pushed by the task queue device; when the task queue device pushes parallel tasks, the task execution devices judge whether the degree of concurrency exceeds the user's limit, and execute the parallel tasks only when it does not. The system of the invention has the advantages of high task execution efficiency and good performance.

Patent
21 Mar 2014
TL;DR: In this paper, a system is provided that includes sensor(s) configured to provide sensed input including measurements of motion and/or orientation of a user during performance of a task to work a complex-system component.
Abstract: A system is provided that includes sensor(s) configured to provide sensed input including measurements of motion and/or orientation of a user during performance of a task to work a complex-system component. The system includes a front-end system configured to process the sensed input including the measurements to identify a known pattern that indicates a significance of the sensed input from which to identify operations of an electronic resource. The front-end system is configured to form and communicate an input to cause the electronic resource to perform the operations and produce an output. The operations include determination of an action of the user during performance of the task, or calculation of a process variable related to performance of the task, from the measurements. And the front-end system is configured to receive the output from the electronic resource, and communicate the output to a display device, audio output device or haptic sensor.

Patent
25 Oct 2014
TL;DR: Computing platform security methods and apparatus are disclosed in this paper, including a security application to configure a security task to detect a malicious element on a computing platform that includes a central processing unit and a graphics processing unit, and an offloader to determine whether the central processing unit or the graphics processing unit is to execute the security task.
Abstract: Computing platform security methods and apparatus are disclosed. An example apparatus includes a security application to configure a security task, the security task to detect a malicious element on a computing platform, the computing platform including a central processing unit and a graphics processing unit; and an offloader to determine whether the central processing unit or the graphics processing unit is to execute the security task; and when the graphics processing unit is to execute the security task, offload the security task to the graphics processing unit for execution.

Patent
02 Apr 2014
TL;DR: In this article, a task scheduling method, device and system consisting of obtaining computing resource information of computing nodes and allocating the idle computing resources to computing frames according to the information is presented.
Abstract: An embodiment of the invention discloses a task scheduling method, device and system. The method comprises the steps of obtaining computing resource information of computing nodes and allocating the idle computing resources to computing frames according to the information, wherein the computing resource information of the computing nodes includes the using conditions of various types of computing resources of the computing nodes; respectively allocating the idle computing resources obtained by the computing frames to tasks in task queues of the computing frames. By applying the task scheduling method, device and system, diversity of the computing resources is considered when the computing resource information of the computing nodes is obtained, so that the computing resources allocated to the tasks are reasonable.

Journal ArticleDOI
TL;DR: In this paper, the authors model the brain as an organization in which a coordinator allocates limited resources to the brain systems responsible for the different tasks, and show that the optimal mechanism is to impose on each system with privately known needs a cap on resources that depends negatively on the amount of resources requested by the other system.
Abstract: When an individual performs several tasks simultaneously, processing resources must be allocated to different brain systems to produce energy for neurons to fire. Following the evidence from neuroscience, we model the brain as an organization in which a coordinator allocates limited resources to the brain systems responsible for the different tasks. Systems are privately informed about the amount of resources necessary to perform their task and compete to obtain the resources. The coordinator arbitrates the demands while satisfying the resource constraint. We show that the optimal mechanism is to impose on each system with privately known needs a cap in resources that depends negatively on the amount of resources requested by the other system. This allocation can be implemented using a biologically plausible mechanism. Finally, we provide some implications of our theory: (i) performance can be flawless for sufficiently simple tasks, (ii) the dynamic allocation rule exhibits inertia (current allocations are increasing in past needs), and (iii) different cognitive tasks are performed by different systems only if the tasks are sufficiently important.
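A toy version of the cap mechanism can make the idea concrete (the specific cap rule below, an equal split of whatever the other system leaves, is an illustrative assumption, not the paper's optimal mechanism):

```python
def allocate(r1, r2, budget):
    """Toy two-system cap mechanism: each system receives at most a cap
    that shrinks as the other system's request grows, and never more
    than it requested, keeping the total within the budget."""
    cap1 = budget - min(r2, budget / 2)  # cap decreases in r2
    cap2 = budget - min(r1, budget / 2)  # cap decreases in r1
    return min(r1, cap1), min(r2, cap2)
```

Modest requests are granted in full, while two greedy requests are both capped at half the budget, reproducing the negative dependence on the other system's demand.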

Proceedings Article
05 Sep 2014
TL;DR: Credible computationally-efficient heuristics are developed to address the problem of large-scale mobile crowd-tasking, and it is shown that a specific heuristic increases the fraction of assigned tasks, and reduces the average detour overhead by more than 60%, compared to the current decentralized approach.
Abstract: We investigate the problem of large-scale mobile crowd-tasking, where a large pool of citizen crowd-workers is used to perform a variety of location-specific urban logistics tasks. Current approaches to such mobile crowd-tasking are very decentralized: a crowd-tasking platform usually provides each worker a set of available tasks close to the worker's current location; each worker then independently chooses which tasks she wants to accept and perform. In contrast, we propose TRACCS, a more coordinated task assignment approach, where the crowd-tasking platform assigns a sequence of tasks to each worker, taking into account their expected location trajectory over a wider time horizon, as opposed to just instantaneous location. We formulate such task assignment as an optimization problem that seeks to maximize the total payoff from all assigned tasks, subject to a maximum bound on the detour (from the expected path) that a worker will experience to complete her assigned tasks. We develop credible computationally-efficient heuristics to address this optimization problem (whose exact solution requires solving a complex integer linear program), and show, via simulations with realistic topologies and commuting patterns, that a specific heuristic (called Greedy-ILS) increases the fraction of assigned tasks by more than 20%, and reduces the average detour overhead by more than 60%, compared to the current decentralized approach.
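The coordinated-assignment idea can be sketched with a heavily simplified greedy heuristic. This is not the paper's Greedy-ILS: positions are one-dimensional, a worker's detour is the out-and-back distance from her planned route, and each worker carries a detour budget; all names and shapes are illustrative:

```python
def detour(route, pos):
    lo, hi = min(route), max(route)
    if lo <= pos <= hi:
        return 0.0                                  # task lies on the planned path
    return 2 * min(abs(pos - lo), abs(pos - hi))    # out-and-back from nearest end

def assign(workers, tasks):
    """workers: {name: {"route": (a, b), "budget": max_detour}};
    tasks: [(payoff, position)]"""
    plan = {w: [] for w in workers}
    for payoff, pos in sorted(tasks, reverse=True):     # highest payoff first
        best = None
        for w, info in workers.items():
            d = detour(info["route"], pos)
            if d <= info["budget"] and (best is None or d < best[0]):
                best = (d, w)
        if best is not None:
            d, w = best
            workers[w]["budget"] -= d   # spend part of the detour budget
            plan[w].append(pos)
    return plan

workers = {"alice": {"route": (0, 10), "budget": 4.0}}
plan = assign(workers, [(5, 12), (3, 4)])
```

The worker picks up the off-route task at position 12 only because it fits her detour budget; the on-route task at position 4 costs nothing extra.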

Journal ArticleDOI
TL;DR: This work uses one step of policy iteration from a starting policy, such as Bernoulli splitting, to derive efficient task assignment (dispatching) policies that minimize the long-run average cost.

Patent
An Yan
30 May 2014
TL;DR: In this article, the authors propose an architecture that facilitates a user experience for continuing computer and/or application tasks across user devices via a cloud service or via a short range wireless peer-to-peer (P2P) architecture.
Abstract: Architecture that facilitates a user experience for continuing computer and/or application tasks across user devices. Task status can be synchronized across devices via a cloud service or via a short-range wireless peer-to-peer (P2P) connection. When applied to searching, for example, the user experience enables users to resume the same search session across devices in several ways. The disclosed architecture can also be extended to other tasks such as web browsing, online meetings, office application sessions, etc. The client application of each device collects the states of each application (e.g., document links, websites, online meeting information, etc.) as part of the synchronization, and uses the states to resume the same applications on different devices (e.g., open the same word processing document, a browser to the same websites, re-join online meetings, etc.).
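A minimal sketch of the continuation flow (snapshot application states, sync them, resume elsewhere) follows. The shared dictionary stands in for the cloud service or P2P exchange, and all structures are illustrative, not taken from the patent:

```python
cloud = {}   # stand-in for the cloud sync service (or a P2P exchange)

def snapshot(device, app_states):
    # The client collects each application's state, e.g. open documents,
    # visited websites, or a pending search query, and pushes it to the service.
    cloud[device] = dict(app_states)

def resume(from_device):
    # Another device pulls the synced states and reopens the same applications.
    state = cloud.get(from_device, {})
    return [f"reopen {app} with {data!r}" for app, data in state.items()]

snapshot("phone", {"search": "hotels in oslo", "browser": ["https://example.com"]})
steps = resume("phone")
```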

Proceedings Article
27 Jul 2014
TL;DR: In this paper, a task allocation algorithm that allows tasks to be easily sequenced to yield high-quality solutions is proposed, which is based on a Fisher market with agents as buyers and tasks as goods.
Abstract: Realistic multi-agent team applications often feature dynamic environments with soft deadlines that penalize late execution of tasks. This puts a premium on quickly allocating tasks to agents, but finding the optimal allocation is NP-hard due to temporal and spatial constraints that require tasks to be executed sequentially by agents. We propose FMC_TA, a novel task allocation algorithm that allows tasks to be easily sequenced to yield high-quality solutions. FMC_TA first finds allocations that are fair (envy-free), balancing the load and sharing important tasks between agents, and efficient (Pareto optimal) in a simplified version of the problem. It computes such allocations in polynomial or pseudo-polynomial time (centrally or distributedly, respectively) using a Fisher market with agents as buyers and tasks as goods. It then heuristically schedules the allocations, taking into account inter-agent constraints on shared tasks. We empirically compare our algorithm to state-of-the-art incomplete methods, both centralized and distributed, on law enforcement problems inspired by real police logs. The results show a clear advantage for FMC_TA both in total utility and in other measures commonly used by law enforcement authorities.
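The two-phase shape of the pipeline (fair allocation first, then per-agent sequencing) can be sketched as follows. This is a crude stand-in: the real algorithm computes a Fisher-market equilibrium, which is approximated here by agents alternately picking their highest-utility remaining task, followed by earliest-deadline-first sequencing per agent; all names are illustrative:

```python
def allocate_and_schedule(agents, tasks):
    """agents: [name]; tasks: {task: {"util": {agent: u}, "deadline": t, "dur": d}}"""
    bundles = {a: [] for a in agents}
    remaining = set(tasks)
    turn = 0
    while remaining:
        a = agents[turn % len(agents)]      # agents pick in turns (crude fairness)
        pick = max(remaining, key=lambda t: tasks[t]["util"][a])
        bundles[a].append(pick)
        remaining.discard(pick)
        turn += 1
    schedule = {}
    for a, bundle in bundles.items():
        now, plan = 0, []
        for t in sorted(bundle, key=lambda t: tasks[t]["deadline"]):  # EDF order
            plan.append((t, now))           # (task, start time)
            now += tasks[t]["dur"]
        schedule[a] = plan
    return schedule

tasks = {
    "t1": {"util": {"p1": 5, "p2": 1}, "deadline": 2, "dur": 1},
    "t2": {"util": {"p1": 1, "p2": 5}, "deadline": 1, "dur": 1},
    "t3": {"util": {"p1": 2, "p2": 2}, "deadline": 3, "dur": 1},
}
schedule = allocate_and_schedule(["p1", "p2"], tasks)
```

Each agent ends up with the tasks it values most, then runs them sequentially in deadline order, mirroring the allocate-then-schedule split.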

Patent
Matej Dusik, Dinkar Mylaraswamy, Jiri Vasek, Jindrich Finda, Michal Kosik
27 Oct 2014
TL;DR: In this paper, a maintenance assistance system and method of operating is described, which includes, but is not limited to, a camera, a heads-up display, a memory configured to maintenance task data, and a processor communicatively coupled to the camera, the headsup display and the memory, the processor configured to determine a component to be serviced, determine a location of the component based upon data from the camera and the maintenance tasks stored in the memory.
Abstract: A maintenance assistance system and a method of operating it are provided. The maintenance assistance system may include, but is not limited to, a camera, a heads-up display, a memory configured to store maintenance task data, and a processor communicatively coupled to the camera, the heads-up display, and the memory, the processor configured to determine a component to be serviced, determine a location of the component based upon data from the camera and the maintenance task data stored in the memory, generate graphical data based upon a maintenance step associated with the component, and output the generated graphical data to the heads-up display.

Journal ArticleDOI
TL;DR: Algorithms for effective slot selection, with complexity linear in the number of available slots, are studied and compared with known approaches; the novelty of the proposed approach consists of allocating alternative sets of slots.
Abstract: In this work, we introduce slot selection and co-allocation algorithms for parallel jobs in distributed computing with non-dedicated and heterogeneous resources. A single slot is a time span that can be assigned to a task, which is a part of a job. The job launch requires the co-allocation of a specified number of slots starting synchronously. The challenge is that slots associated with different resources of distributed computational environments may have arbitrary start and finish points that do not match. Some existing algorithms assign a job to the first set of slots matching the resource request without any optimization (the first-fit type), while other algorithms are based on an exhaustive search. In this paper, algorithms for effective slot selection, with complexity linear in the number of available slots, are studied and compared with known approaches. The novelty of the proposed approach consists of allocating alternative sets of slots. It provides possibilities to optimize job scheduling.
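The co-allocation problem itself can be illustrated with a simple scan over candidate start times (an illustration of the problem statement, not the paper's linear-complexity algorithm): find the earliest moment at which enough resources each have a free slot long enough for the job to start synchronously.

```python
def co_allocate(slots, need, length):
    """slots: {resource: [(start, end), ...]}; returns (start, resources) or None."""
    starts = sorted({s for windows in slots.values() for s, _ in windows})
    for t in starts:                        # candidate synchronous start points
        fitting = [r for r, ws in slots.items()
                   if any(s <= t and t + length <= e for s, e in ws)]
        if len(fitting) >= need:
            return t, fitting[:need]
    return None

slots = {"r1": [(0, 5)], "r2": [(2, 10)], "r3": [(3, 4)]}
found = co_allocate(slots, need=2, length=3)
```

At time 2 both r1 and r2 can host a length-3 task starting together, even though their free windows have mismatched start and finish points, which is exactly the difficulty the paper describes.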

Journal ArticleDOI
01 Jul 2014
TL;DR: The fundamental abstractions underlying the programming model, as well as performance, determinism, and fault resilience considerations, are discussed and a pilot C++ library implementation for clusters of multicore machines is presented.
Abstract: We propose Chunks and Tasks, a parallel programming model built on abstractions for both data and work. The application programmer specifies how data and work can be split into smaller pieces, chunks and tasks, respectively. The Chunks and Tasks library maps the chunks and tasks to physical resources. In this way we seek to combine user friendliness with high performance. An application programmer can express a parallel algorithm using a few simple building blocks, defining data and work objects and their relationships. No explicit communication calls are needed; the distribution of both work and data is handled by the Chunks and Tasks library. This makes efficient implementation of complex applications that require dynamic distribution of work and data easier. At the same time, Chunks and Tasks imposes restrictions on data access and task dependencies that facilitate the development of high performance parallel back ends. We discuss the fundamental abstractions underlying the programming model, as well as performance, determinism, and fault resilience considerations. We also present a pilot C++ library implementation for clusters of multicore machines and demonstrate its performance for irregular block-sparse matrix-matrix multiplication.
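The programming model can be mimicked in a toy serial emulation. The real library is a distributed C++ runtime; the `Chunk`, `SumTask`, and `run` names below are illustrative stand-ins showing the core idea that the programmer only specifies how data and work split, while a runtime decides placement:

```python
class Chunk:                          # an immutable piece of data
    def __init__(self, values):
        self.values = tuple(values)

class SumTask:                        # a task describes how its work splits
    LEAF = 2
    def execute(self, chunk):
        if len(chunk.values) <= self.LEAF:
            return sum(chunk.values)  # small enough: compute directly
        mid = len(chunk.values) // 2
        # Spawn child tasks on child chunks; no explicit communication calls,
        # the runtime (here a trivial recursive executor) handles distribution.
        left = run(SumTask(), Chunk(chunk.values[:mid]))
        right = run(SumTask(), Chunk(chunk.values[mid:]))
        return left + right

def run(task, chunk):                 # stand-in for the library's scheduler
    return task.execute(chunk)

total = run(SumTask(), Chunk(range(10)))
```

A real back end could map `run` calls onto cluster nodes without changing the application code, which is the user-friendliness/performance split the abstract describes.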