
Showing papers in "Journal of Artificial Intelligence Research in 2019"


Journal ArticleDOI
TL;DR: Cross-lingual representations of words enable us to reason about word meaning in multilingual contexts and are a key facilitator of cross-lingual transfer when developing natural language processing...
Abstract: Cross-lingual representations of words enable us to reason about word meaning in multilingual contexts and are a key facilitator of cross-lingual transfer when developing natural language processing...

288 citations
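
A standard technique covered by surveys of this area is aligning two monolingual embedding spaces with an orthogonal map learned from a seed dictionary (the Procrustes solution). The sketch below is illustrative only, with random toy vectors standing in for real bilingual data; it is not taken from the paper itself:

```python
import numpy as np

def procrustes_align(X, Y):
    """Learn an orthogonal map W minimising ||XW - Y||_F, given word
    vectors X (source language) and Y (target language) whose rows are
    aligned by a seed dictionary of translation pairs."""
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt  # orthogonal by construction

rng = np.random.default_rng(0)
# Toy data: 100 "translation pairs" of 50-d vectors; Y is a rotated,
# noisy copy of X, standing in for a genuinely bilingual seed lexicon.
X = rng.normal(size=(100, 50))
R, _ = np.linalg.qr(rng.normal(size=(50, 50)))   # random rotation
Y = X @ R + 0.01 * rng.normal(size=(100, 50))

W = procrustes_align(X, Y)
print(np.linalg.norm(X @ W - Y))  # small residual: the spaces are aligned
```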


Journal ArticleDOI
TL;DR: A taxonomy of solutions for the general knowledge reuse problem is defined, providing a comprehensive discussion of recent progress on knowledge reuse in Multiagent Systems (MAS) and of techniques for knowledge reuse across agents (which may or may not be acting in a shared environment).
Abstract: Multiagent Reinforcement Learning (RL) solves complex tasks that require coordination with other agents through autonomous exploration of the environment. However, learning a complex task from scratch...

163 citations


Journal ArticleDOI
TL;DR: A brief history of LI research and an extensive survey of the features and methods used in the LI literature are presented in a unified notation; evaluation methods, applications, and off-the-shelf LI systems that do not require training by the end user are discussed; and future directions for research in LI are proposed.
Abstract: Language identification (“LI”) is the problem of determining the natural language that a document or part thereof is written in. Automatic LI has been extensively researched for over fifty years. Today, LI is a key part of many text processing pipelines, as text processing techniques generally assume that the language of the input text is known. Research in this area has recently been especially active. This article provides a brief history of LI research, and an extensive survey of the features and methods used in the LI literature. We describe the features and methods using a unified notation, to make the relationships between methods clearer. We discuss evaluation methods, applications of LI, as well as off-the-shelf LI systems that do not require training by the end user. Finally, we identify open issues, survey the work to date on each issue, and propose future directions for research in LI.

133 citations
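
The classic LI recipe the survey describes, character n-gram features fed to a simple classifier, fits in a few lines. A minimal sketch, assuming scikit-learn and a toy six-sentence corpus (a real system trains on far more text):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny illustrative corpus; a real LI system trains on large corpora.
texts = ["the cat sat on the mat", "where is the train station",
         "el gato se sienta en la alfombra", "donde esta la estacion",
         "le chat est assis sur le tapis", "ou est la gare"]
labels = ["en", "en", "es", "es", "fr", "fr"]

model = make_pipeline(
    CountVectorizer(analyzer="char_wb", ngram_range=(1, 3)),  # char n-grams
    MultinomialNB())
model.fit(texts, labels)
print(model.predict(["the station is near", "la estacion esta cerca"]))
```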


Journal ArticleDOI
TL;DR: Recognizing that any AI system has humans in the loop, HIT-AI will reward aware and unaware knowledge producers with a different scheme: decisions of AI systems generating revenues will repay the legitimate owners of the knowledge used for taking those decisions.
Abstract: Little by little, newspapers are revealing the bright future that Artificial Intelligence (AI) is building. Intelligent machines will help everywhere. However, this bright future has a dark side: a dramatic job market contraction before its unpredictable transformation. Hence, in the near future, large numbers of job seekers will need financial support while catching up with these novel unpredictable jobs. This possible job market crisis, however, carries its own antidote. In fact, the rise of AI is sustained by the biggest knowledge theft of recent years. Learning AI machines are extracting knowledge from unaware skilled or unskilled workers by analyzing their interactions. By passionately doing their jobs, these workers are digging their own graves. In this paper, we propose Human-in-the-loop Artificial Intelligence (HIT-AI) as a fairer paradigm for Artificial Intelligence systems. HIT-AI will reward aware and unaware knowledge producers with a different scheme: decisions of AI systems generating revenues will repay the legitimate owners of the knowledge used for taking those decisions. As modern Robin Hoods, HIT-AI researchers should fight for a fairer Artificial Intelligence that gives back what it steals.

107 citations


Journal ArticleDOI
TL;DR: This work extends three leading Dec-POMDP algorithms for policy generation to the macro-action case, and can synthesize control policies that exploit opportunities for coordination while balancing uncertainty, sensor information, and information about other agents.
Abstract: Decentralized partially observable Markov decision processes (Dec-POMDPs) are general models for decentralized multi-agent decision making under uncertainty. However, they typically model a problem at a low level of granularity, where each agent's actions are primitive operations lasting exactly one time step. We address the case where each agent has macro-actions: temporally extended actions that may require different amounts of time to execute. We model macro-actions as options in a Dec-POMDP, focusing on actions that depend only on information directly available to the agent during execution. Therefore, we model systems where coordination decisions only occur at the level of deciding which macro-actions to execute. The core technical difficulty in this setting is that the options chosen by each agent no longer terminate at the same time. We extend three leading Dec-POMDP algorithms for policy generation to the macro-action case, and demonstrate their effectiveness in both standard benchmarks and a multi-robot coordination problem. The results show that our new algorithms retain agent coordination while allowing high-quality solutions to be generated for significantly longer horizons and larger state-spaces than previous Dec-POMDP methods. Furthermore, in the multi-robot domain, we show that, in contrast to most existing methods that are specialized to a particular problem class, our approach can synthesize control policies that exploit opportunities for coordination while balancing uncertainty, sensor information, and information about other agents.

57 citations
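
To make the setting concrete, here is a minimal sketch of a macro-action as an option (a local policy plus a termination test) and of the asynchronous-termination issue the abstract highlights. All names (MacroAction, tick, the toy options) are hypothetical illustrations, not from the paper:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class MacroAction:
    """An option as used above: a local policy plus a termination test,
    both depending only on the agent's own observation history."""
    name: str
    policy: Callable[[list], str]        # local history -> primitive action
    terminates: Callable[[list], bool]   # local history -> done?

def tick(choosers, histories, running):
    """One synchronous time step. Only agents whose macro-action has
    terminated pick a new one -- the crux noted in the abstract: options
    chosen by different agents no longer terminate at the same time."""
    actions = []
    for i, choose in enumerate(choosers):
        if running[i] is None or running[i].terminates(histories[i]):
            running[i] = choose(histories[i])           # coordination point
        actions.append(running[i].policy(histories[i]))
    return actions

# Toy usage: a three-step "navigate" option next to a one-step "grab".
nav = MacroAction("navigate", lambda h: "move", lambda h: len(h) % 3 == 0)
grab = MacroAction("grab", lambda h: "grip", lambda h: True)
running, hist = [None, None], [[], []]
for t in range(4):
    acts = tick([lambda h: nav, lambda h: grab], hist, running)
    for h, a in zip(hist, acts):
        h.append(a)
    print(t, acts)
```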


Journal ArticleDOI
TL;DR: In this article, the authors describe an architecture for robots that combines the complementary strengths of probabilistic graphical models and declarative programming to represent and reason with logic-based and probabilistic descriptions of uncertainty and domain knowledge.
Abstract: This paper describes an architecture for robots that combines the complementary strengths of probabilistic graphical models and declarative programming to represent and reason with logic-based and probabilistic descriptions of uncertainty and domain knowledge. An action language is extended to support non-boolean fluents and non-deterministic causal laws. This action language is used to describe tightly-coupled transition diagrams at two levels of granularity, with a fine-resolution transition diagram defined as a refinement of a coarse-resolution transition diagram of the domain. The coarse-resolution system description, and a history that includes (prioritized) defaults, are translated into an Answer Set Prolog (ASP) program. For any given goal, inference in the ASP program provides a plan of abstract actions. To implement each such abstract action, the robot automatically zooms to the part of the fine-resolution transition diagram relevant to this action. A probabilistic representation of the uncertainty in sensing and actuation is then included in this zoomed fine-resolution system description, and used to construct a partially observable Markov decision process (POMDP). The policy obtained by solving the POMDP is invoked repeatedly to implement the abstract action as a sequence of concrete actions, with the corresponding observations being recorded in the coarse-resolution history and used for subsequent reasoning. The architecture is evaluated in simulation and on a mobile robot moving objects in an indoor domain, to show that it supports reasoning with violation of defaults, noisy observations and unreliable actions, in complex domains.

43 citations


Journal ArticleDOI
TL;DR: This work shows that query answering under the intractable AR semantics can be performed efficiently by using IAR and brave semantics as tractable approximations and encoding the AR entailment problem as a propositional satisfiability (SAT) problem.
Abstract: Several inconsistency-tolerant semantics have been introduced for querying inconsistent description logic knowledge bases. The first contribution of this paper is a practical approach for computing...

41 citations
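
The three semantics can be illustrated by brute force on a toy inconsistent knowledge base: AR answers hold in every repair (maximal consistent subset), IAR answers hold in the intersection of the repairs, and brave answers hold in some repair. The enumeration below is purely didactic; the paper's contribution is precisely to avoid it via SAT encodings:

```python
from itertools import combinations

def consistent(facts):
    """Toy consistency test: a set is inconsistent if it contains both
    a fact and its negation (here, 'p' and '-p')."""
    return not any(("-" + f) in facts for f in facts if not f.startswith("-"))

def repairs(kb):
    """All maximal consistent subsets of the (possibly inconsistent) ABox."""
    subsets = [set(c) for r in range(len(kb), 0, -1)
               for c in combinations(kb, r) if consistent(set(c))]
    return [s for s in subsets if not any(s < t for t in subsets)]

kb = {"p", "-p", "q"}             # inconsistent: both p and -p asserted
rs = repairs(kb)                  # {{p, q}, {-p, q}}

entails = lambda s, f: f in s     # stand-in for DL query answering
for f in ("q", "p"):
    print(f,
          "AR:", all(entails(r, f) for r in rs),       # holds in every repair
          "IAR:", entails(set.intersection(*rs), f),   # holds in the intersection
          "brave:", any(entails(r, f) for r in rs))    # holds in some repair
```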


Journal ArticleDOI
TL;DR: In this article, the authors identify several pitfalls in the experimental design that can render the procedure ineffective and propose best practices for avoiding them, as well as a tool called GenericWrapper4AC.
Abstract: Good parameter settings are crucial to achieve high performance in many areas of artificial intelligence (AI), such as propositional satisfiability solving, AI planning, scheduling, and machine learning (in particular deep learning). Automated algorithm configuration methods have recently received much attention in the AI community since they replace tedious, irreproducible and error-prone manual parameter tuning and can lead to new state-of-the-art performance. However, practical applications of algorithm configuration are prone to several (often subtle) pitfalls in the experimental design that can render the procedure ineffective. We identify several common issues and propose best practices for avoiding them. As one possibility for automatically handling as many of these as possible, we also propose a tool called GenericWrapper4AC.

38 citations
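
One pitfall class the paper discusses is unreliable runtime measurement and unenforced resource limits when evaluating a configuration. A minimal wrapper sketch in that spirit follows; it is not the actual GenericWrapper4AC interface, just an illustration of measuring wall-clock time and enforcing a cutoff externally:

```python
import subprocess, sys, time

def run_target(cmd, instance, cutoff, seed):
    """Run one configuration evaluation with an enforced cutoff and report
    status/runtime uniformly -- the kind of measurement a wrapper such as
    GenericWrapper4AC standardises to avoid common pitfalls (unenforced
    time limits, trusting the target's self-reported runtime)."""
    start = time.perf_counter()
    try:
        proc = subprocess.run(cmd + [instance, str(seed)],
                              capture_output=True, timeout=cutoff)
        status = "SUCCESS" if proc.returncode == 0 else "CRASHED"
    except subprocess.TimeoutExpired:
        status = "TIMEOUT"
    runtime = time.perf_counter() - start
    # Report the wrapper's own measurement, never the target's.
    print(f"Result: {status}, runtime={min(runtime, cutoff):.2f}s")

if __name__ == "__main__":
    # Hypothetical target: a script that sleeps past the cutoff.
    run_target([sys.executable, "-c", "import time; time.sleep(2)"],
               instance="dummy", cutoff=1.0, seed=42)
```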


Journal ArticleDOI
TL;DR: This work shows that, in some cases with n agents, no allocation can guarantee better than a 1/n approximation of a fair allocation when the entitlements are not necessarily equal, and devises a simple algorithm that ensures a 1/n approximation guarantee.
Abstract: We study fair allocation of indivisible goods to agents with unequal entitlements. Fair allocation has been the subject of many studies in both divisible and indivisible settings. Our emphasis is on...

37 citations
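
To make "a fraction of a fair share under unequal entitlements" concrete, the following exhaustive search maximises the worst ratio between an agent's bundle value and its weighted proportional share. This brute force is illustrative only and is not the paper's algorithm:

```python
from itertools import product

def best_allocation(values, weights):
    """Exhaustive search over allocations of m indivisible goods to n
    agents, maximising the worst ratio between an agent's bundle value
    and its weighted proportional share (weight_i times agent i's value
    for everything). Exponential in m -- purely to make the notion of an
    approximation of a fair share concrete."""
    n, m = len(values), len(values[0])
    shares = [weights[i] * sum(values[i]) for i in range(n)]
    best, best_ratio = None, -1.0
    for assign in product(range(n), repeat=m):          # good j -> agent
        got = [sum(values[i][j] for j in range(m) if assign[j] == i)
               for i in range(n)]
        ratio = min(g / s for g, s in zip(got, shares))
        if ratio > best_ratio:
            best, best_ratio = assign, ratio
    return best, best_ratio

# Two agents with entitlements 2/3 and 1/3, three goods (toy numbers).
values = [[6, 3, 3], [4, 4, 4]]
weights = [2 / 3, 1 / 3]
print(best_allocation(values, weights))
```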


Journal ArticleDOI
TL;DR: This paper addresses the problem of multi-agent inverse reinforcement learning (MIRL) in a two-player general-sum stochastic game framework, proposing novel approaches to five variants of MIRL under the assumption that the game observer either knows or is able to accurately estimate the policies and solution concepts of the players.
Abstract: This paper addresses the problem of multi-agent inverse reinforcement learning (MIRL) in a two-player general-sum stochastic game framework. Five variants of MIRL are considered: uCS-MIRL, advE-MIRL, cooE-MIRL, uCE-MIRL, and uNE-MIRL, each distinguished by its solution concept. Problem uCS-MIRL is a cooperative game in which the agents employ cooperative strategies that aim to maximize the total game value. In problem uCE-MIRL, agents are assumed to follow strategies that constitute a correlated equilibrium while maximizing total game value. Problem uNE-MIRL is similar to uCE-MIRL in total game value maximization, but it is assumed that the agents are playing a Nash equilibrium. Problems advE-MIRL and cooE-MIRL assume agents are playing an adversarial equilibrium and a coordination equilibrium, respectively. We propose novel approaches to address these five problems under the assumption that the game observer either knows or is able to accurately estimate the policies and solution concepts for players. For uCS-MIRL, we first develop a characteristic set of solutions ensuring that the observed bi-policy is a uCS and then apply a Bayesian inverse learning method. For uCE-MIRL, we develop a linear programming problem subject to constraints that define necessary and sufficient conditions for the observed policies to be correlated equilibria. The objective is to choose a solution that not only minimizes the total game value difference between the observed bi-policy and a local uCS, but also maximizes the scale of the solution. We apply a similar treatment to the problem of uNE-MIRL. The remaining two problems can be solved efficiently by taking advantage of solution uniqueness and setting up a convex optimization problem. Results are validated on various benchmark grid-world games.

34 citations


Journal ArticleDOI
TL;DR: Two new sampling-based DCOP algorithms are introduced, called Sequential Distributed Gibbs (SD-Gibbs) and Parallel Distributed Gibbs (PD-Gibbs), whose memory requirements per agent are linear in the number of agents in the problem.
Abstract: Researchers have used distributed constraint optimization problems (DCOPs) to model various multi-agent coordination and resource allocation problems. Very recently, Ottens et al. proposed a promising new approach to solve DCOPs that is based on confidence bounds via their Distributed UCT (DUCT) sampling-based algorithm. Unfortunately, its memory requirement per agent is exponential in the number of agents in the problem, which prohibits it from scaling up to large problems. Thus, in this article, we introduce two new sampling-based DCOP algorithms called Sequential Distributed Gibbs (SD-Gibbs) and Parallel Distributed Gibbs (PD-Gibbs). Both algorithms have memory requirements per agent that are linear in the number of agents in the problem. Our empirical results show that our algorithms can find solutions that are better than DUCT, run faster than DUCT, and solve some large problems that DUCT failed to solve due to memory limitations.
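
A centralised simulation conveys the core of the sequential scheme: agents sample new values one at a time, each conditioning only on its neighbours' current values, so per-agent memory stays linear. This toy sketch is in the spirit of SD-Gibbs rather than a faithful implementation of the distributed protocol:

```python
import math, random

# A toy DCOP: binary variables, pairwise utilities on constraint-graph edges.
domain = [0, 1]
edges = {(0, 1): lambda a, b: 2 if a != b else 0,
         (1, 2): lambda a, b: 2 if a != b else 0,
         (0, 2): lambda a, b: 1 if a == b else 0}

def local_utility(i, v, assign):
    """Utility of agent i taking value v given its neighbours' values."""
    u = 0
    for (a, b), f in edges.items():
        if a == i: u += f(v, assign[b])
        elif b == i: u += f(assign[a], v)
    return u

def sequential_gibbs(n, iters=200, temp=0.5, seed=0):
    """Agents sample one at a time, each conditioning on the current
    values of its neighbours -- memory per agent stays linear in n,
    unlike confidence-bound schemes such as DUCT."""
    random.seed(seed)
    assign = [random.choice(domain) for _ in range(n)]
    best, best_u = list(assign), -math.inf
    for _ in range(iters):
        for i in range(n):                       # sequential schedule
            ws = [math.exp(local_utility(i, v, assign) / temp) for v in domain]
            assign[i] = random.choices(domain, weights=ws)[0]
        total = sum(f(assign[a], assign[b]) for (a, b), f in edges.items())
        if total > best_u:
            best, best_u = list(assign), total
    return best, best_u

print(sequential_gibbs(3))
```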

Journal ArticleDOI
TL;DR: In this paper, the multi-fidelity bandit problem is formulated as a Gaussian process bandit, where the target function and its approximations are sampled from a Gaussian process.
Abstract: In many scientific and engineering applications, we are tasked with the maximisation of an expensive-to-evaluate black box function $f$. Traditional settings for this problem assume just the availability of this single function. However, in many cases, cheap approximations to $f$ may be obtainable. For example, the expensive real-world behaviour of a robot can be approximated by a cheap computer simulation. We can use these approximations to eliminate low function value regions cheaply, spend the expensive evaluations of $f$ in a small but promising region, and speedily identify the optimum. We formalise this task as a multi-fidelity bandit problem where the target function and its approximations are sampled from a Gaussian process. We develop MF-GP-UCB, a novel method based on upper confidence bound techniques. In our theoretical analysis we demonstrate that it exhibits precisely the above behaviour, and achieves better regret than strategies which ignore multi-fidelity information. Empirically, MF-GP-UCB outperforms such naive strategies and other multi-fidelity methods on several synthetic and real experiments.
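
A simplified two-fidelity sketch of the MF-GP-UCB idea: fit one GP per fidelity, take the pointwise tighter of the two upper confidence bounds (the cheap one inflated by a known bias bound), and query the cheap fidelity while it is still informative. The functions, kernel, and constants below are toy assumptions, not the paper's setup; scikit-learn supplies the GPs:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

f_hi = lambda x: -(x - 0.6) ** 2                         # expensive target (toy)
f_lo = lambda x: -(x - 0.6) ** 2 + 0.05 * np.sin(8 * x)  # cheap approximation
zeta, beta, gamma = 0.06, 2.0, 0.02  # bias bound, UCB width, fidelity threshold

X = np.linspace(0, 1, 201).reshape(-1, 1)
data = {m: ([], []) for m in (0, 1)}                     # fidelity -> (xs, ys)
rng = np.random.default_rng(1)
for m in (0, 1):                                         # seed each GP
    for x in rng.uniform(0, 1, 2):
        data[m][0].append([x])
        data[m][1].append([f_lo, f_hi][m](x))

for t in range(20):
    mus, stds = {}, {}
    for m in (0, 1):
        gp = GaussianProcessRegressor(kernel=RBF(0.2), alpha=1e-6)
        gp.fit(np.array(data[m][0]), np.array(data[m][1]))
        mus[m], stds[m] = gp.predict(X, return_std=True)
    # An upper bound on f from both fidelities: the cheap GP is inflated by
    # its bias bound zeta, and the tighter bound wins pointwise.
    ucb = np.minimum(mus[0] + beta * stds[0] + zeta, mus[1] + beta * stds[1])
    j = int(np.argmax(ucb))
    m = 0 if stds[0][j] > gamma else 1   # stay cheap while still informative
    data[m][0].append([X[j, 0]])
    data[m][1].append([f_lo, f_hi][m](X[j, 0]))

best = max(zip(data[1][1], (x[0] for x in data[1][0])))
print("best high-fidelity point:", best[1], "value:", best[0])
```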

Journal ArticleDOI
TL;DR: In this article, the authors study the community structure of industrial SAT instances and show that most application benchmarks are characterized by a high modularity, whereas random SAT instances are closer to the classical Erdős–Rényi random graph model, where no structure can be observed.
Abstract: Modern SAT solvers have experienced a remarkable progress on solving industrial instances. Most of the techniques have been developed after an intensive experimental process. It is believed that these techniques exploit the underlying structure of industrial instances. However, there are few works trying to exactly characterize the main features of this structure. The research community on complex networks has developed techniques of analysis and algorithms to study real-world graphs that can be used by the SAT community. Recently, there have been some attempts to analyze the structure of industrial SAT instances in terms of complex networks, with the aim of explaining the success of SAT solving techniques, and possibly improving them. In this paper, inspired by the results on complex networks, we study the community structure, or modularity, of industrial SAT instances. In a graph with clear community structure, or high modularity, we can find a partition of its nodes into communities such that most edges connect variables of the same community. In our analysis, we represent SAT instances as graphs, and we show that most application benchmarks are characterized by a high modularity. On the contrary, random SAT instances are closer to the classical Erdős–Rényi random graph model, where no structure can be observed. We also analyze how this structure evolves by the effects of the execution of a CDCL SAT solver. In particular, we use the community structure to detect that new clauses learned by the solver during the search contribute to destroy the original structure of the formula. That is, learned clauses tend to contain variables of distinct communities.
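
The analysis pipeline is easy to reproduce in miniature: build the variable incidence graph of a CNF formula and measure its modularity. A sketch using networkx on a hand-made formula with two variable communities (the paper works with full industrial benchmarks and also considers other graph representations):

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities, modularity

def vig(clauses):
    """Variable incidence graph: variables are nodes, and two variables
    are connected if they co-occur in some clause."""
    g = nx.Graph()
    for clause in clauses:
        vs = [abs(l) for l in clause]
        g.add_nodes_from(vs)
        g.add_edges_from((a, b) for i, a in enumerate(vs) for b in vs[i + 1:])
    return g

# Toy formula with two loosely connected variable groups (DIMACS-style ints).
cnf = [[1, 2], [-2, 3], [1, -3],        # community {1, 2, 3}
       [4, 5], [-5, 6], [4, -6],        # community {4, 5, 6}
       [3, -4]]                          # single bridging clause
g = vig(cnf)
parts = greedy_modularity_communities(g)
print("communities:", [sorted(c) for c in parts])
print("modularity Q =", round(modularity(g, parts), 3))
```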

Journal ArticleDOI
TL;DR: This extended version of earlier work generalises the framework to the continuous domain and discusses the results, including the conditions under which the findings can be generalised back to goal recognition in general task-planning.
Abstract: Goal recognition is the problem of determining an agent's intent by observing her behaviour. Contemporary solutions for general task-planning relate the probability of a goal to the cost of reaching it. We adapt this approach to goal recognition in the strict context of path-planning. We show (1) that a simpler formula provides an identical result to current state-of-the-art in less than half the time under all but one set of conditions. Further, we prove (2) that the probability distribution based on this technique is independent of an agent's past behaviour and present a revised formula that achieves goal recognition by reference to the agent's starting point and current location only. Building on this, we demonstrate (3) that a Radius of Maximum Probability (i.e., the distance from a goal within which that goal is guaranteed to be the most probable) can be calculated from relative cost-distances between the candidate goals and a start location, without needing to calculate any actual probabilities. In this extended version of earlier work, we generalise our framework to the continuous domain and discuss our results, including the conditions under which our findings can be generalised back to goal recognition in general task-planning.
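
In the cost-difference formulation this line of work builds on, a goal's posterior depends on how much the observed position has sacrificed relative to the cheapest path to that goal. A minimal sketch, using Manhattan distance as the cost function and a softmax over cost differences; the paper's exact formula and its start-and-current-location simplification may differ in detail:

```python
import math

def goal_posterior(start, current, goals, cost, beta=1.0):
    """Posterior over candidate goals from cost differences only: a goal
    is more probable the less the agent's current position has sacrificed
    relative to the cheapest path to that goal. Depends only on the start
    and current locations, echoing result (2) above; `cost` is any
    cost-distance, here Manhattan distance on a grid."""
    scores = [math.exp(-beta * (cost(start, current) + cost(current, g)
                                - cost(start, g))) for g in goals]
    z = sum(scores)
    return [s / z for s in scores]

manhattan = lambda a, b: abs(a[0] - b[0]) + abs(a[1] - b[1])
goals = [(10, 0), (0, 10)]
for current in [(0, 0), (4, 0), (8, 1)]:
    probs = goal_posterior((0, 0), current, goals, manhattan, beta=0.5)
    print(current, [round(p, 3) for p in probs])
```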

Journal ArticleDOI
TL;DR: This paper presents a general point-based value iteration algorithm for finite-horizon POMDP problems which provides solutions with guarantees on solution quality and introduces two heuristics to reduce the number of belief points considered during execution, which lowers the computational requirements.
Abstract: Partially Observable Markov Decision Processes (POMDPs) are a popular formalism for sequential decision making in partially observable environments. Since solving POMDPs to optimality is a difficult task, point-based value iteration methods are widely used. These methods compute an approximate POMDP solution, and in some cases they even provide guarantees on the solution quality, but these algorithms have been designed for problems with an infinite planning horizon. In this paper we discuss why state-of-the-art point-based algorithms cannot be easily applied to finite-horizon problems that do not include discounting. Subsequently, we present a general point-based value iteration algorithm for finite-horizon problems which provides solutions with guarantees on solution quality. Furthermore, we introduce two heuristics to reduce the number of belief points considered during execution, which lowers the computational requirements. In experiments we demonstrate that the algorithm is an effective method for solving finite-horizon POMDPs.
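
The finite-horizon, undiscounted objective itself is easy to state: $V_0(b) = 0$ and $V_t(b) = \max_a [ R(b,a) + \sum_o P(o \mid b,a)\, V_{t-1}(b_o) ]$. The sketch below evaluates this recursion exactly on a tiger-style toy POMDP; it is exponential in the horizon and is meant only to illustrate the objective, not the paper's point-based algorithm:

```python
S = (0, 1)                                    # tiger-left, tiger-right
A = ("listen", "open-left", "open-right")
OBS = (0, 1)                                  # hear-left, hear-right

def reward(s, a):
    if a == "listen":
        return -1.0
    return -100.0 if (a == "open-left") == (s == 0) else 10.0

def listen_update(belief, o):
    """Bayes update of the belief after hearing observation o."""
    raw = tuple((0.85 if o == s else 0.15) * belief[s] for s in S)
    z = sum(raw)
    return tuple(p / z for p in raw), z

def value(belief, horizon):
    """Exact undiscounted finite-horizon value:
    V_0 = 0,  V_t(b) = max_a [ R(b,a) + sum_o P(o|b,a) * V_{t-1}(b_o) ]."""
    if horizon == 0:
        return 0.0, None
    best = (-float("inf"), None)
    for a in A:
        q = sum(belief[s] * reward(s, a) for s in S)
        for o in OBS:
            if a == "listen":
                b2, po = listen_update(belief, o)
            else:                              # opening resets the problem
                b2, po = (0.5, 0.5), 0.5
            q += po * value(b2, horizon - 1)[0]
        best = max(best, (q, a))
    return best

for h in (1, 2, 5):
    print("horizon", h, "->", value((0.5, 0.5), h))
```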

Journal ArticleDOI
TL;DR: Conditional simple temporal networks with uncertainty and resources (CSTNURs) are introduced, in which runtime resource constraints (RRCs) restrict resource availability through further temporal constraints among resources, and a fully-automated encoding is provided to translate any CSTNUR into an equivalent timed game automaton in polynomial time for sound and complete DC-checking.
Abstract: Conditional simple temporal networks with uncertainty (CSTNUs) allow for the representation of temporal plans subject to both conditional constraints and uncertain durations. Dynamic controllability (DC) of CSTNUs ensures the existence of an execution strategy able to execute the network in real time (i.e., scheduling the time points under control) depending on how these two uncontrollable parts behave. However, CSTNUs do not deal with resources. In this paper, we define conditional simple temporal networks with uncertainty and resources (CSTNURs) by injecting resources and runtime resource constraints (RRCs) into the specification. Resources are mandatory for executing the time points and their availability is represented through temporal expressions, whereas RRCs restrict resource availability by further temporal constraints among resources. We provide a fully-automated encoding to translate any CSTNUR into an equivalent timed game automaton in polynomial time for a sound and complete DC-checking.

Journal ArticleDOI
TL;DR: This article presents a comprehensive survey of the research from the past decades on temporal reasoning for automatic temporal information extraction from text, providing a case study on the integration of symbolic reasoning with machine learning-based information extraction systems.
Abstract: Time is deeply woven into how people perceive, and communicate about, the world. Almost unconsciously, we provide our language utterances with temporal cues, like verb tenses, and we can hardly produce sentences without such cues. Extracting temporal cues from text, and constructing a global temporal view about the order of described events is a major challenge of automatic natural language understanding. Temporal reasoning, the process of combining different temporal cues into a coherent temporal view, plays a central role in temporal information extraction. This article presents a comprehensive survey of the research from the past decades on temporal reasoning for automatic temporal information extraction from text, providing a case study on how combining symbolic reasoning with machine learning-based information extraction systems can improve performance. It gives a clear overview of the methodologies used for temporal reasoning, and explains how temporal reasoning can be, and has been, successfully integrated into temporal information extraction systems. Based on the distillation of existing work, this survey also suggests currently unexplored research areas. We argue that the level of temporal reasoning that current systems use is still incomplete for the full task of temporal information extraction, and that a deeper understanding of how the various types of temporal information can be integrated into temporal reasoning is required to drive future research in this area.

Journal ArticleDOI
TL;DR: A novel approach for dense captioning based on hourglass-structured residual learning is put forward, which outperforms most current methods on the Visual Genome V1.0 dataset.
Abstract: Recent research on dense captioning based on the recurrent neural network and the convolutional neural network has made great progress. However, mapping from an image feature space to a description space is a nonlinear and multimodal task, which makes it difficult for the current methods to get accurate results. In this paper, we put forward a novel approach for dense captioning based on hourglass-structured residual learning. Discriminant feature maps are obtained by incorporating dense connected networks and residual learning in our model. Finally, the performance of the approach on the Visual Genome V1.0 dataset and the region-labelled MS-COCO (Microsoft Common Objects in Context) dataset is demonstrated. The experimental results have shown that our approach outperforms most current methods.

Journal ArticleDOI
TL;DR: In this paper, the authors analyze the tradeoff between asymptotic bias and overfitting in the context of reinforcement learning with partial observability.
Abstract: This paper provides an analysis of the tradeoff between asymptotic bias (suboptimality with unlimited data) and overfitting (additional suboptimality due to limited data) in the context of reinforcement learning with partial observability. Our theoretical analysis formally characterizes that while potentially increasing the asymptotic bias, a smaller state representation decreases the risk of overfitting. This analysis relies on expressing the quality of a state representation by bounding $L_1$ error terms of the associated belief states. Theoretical results are empirically illustrated when the state representation is a truncated history of observations, both on synthetic POMDPs and on a large-scale POMDP in the context of smartgrids, with real-world data. Finally, similarly to known results in the fully observable setting, we also briefly discuss and empirically illustrate how using function approximators and adapting the discount factor may enhance the tradeoff between asymptotic bias and overfitting in the partially observable context.

Journal ArticleDOI
TL;DR: A strategy-proof mechanism is designed that works in polynomial time for computing a pairwise stable matching in typed weighted markets, in which students are partitioned into types that induce their possible wages.
Abstract: We investigate markets with a set of students on one side and a set of colleges on the other. A student and college can be linked by a weighted contract that defines the student's wage, while a college's budget for hiring students is limited. Stability is a crucial requirement for matching mechanisms to be applied in the real world. A standard stability requirement is coalitional stability, i.e., no pair of a college and group of students has any incentive to deviate. We find that a coalitionally stable matching is not guaranteed to exist, verifying the coalitional stability for a given matching is coNP-complete, and the problem of deciding whether a coalitionally stable matching exists in a given market is $\Sigma^P_2$-complete (i.e., $\mathrm{NP}^{\mathrm{NP}}$-complete). Other negative results also hold when blocking coalitions contain at most two students and one college. Given these computational hardness results, we pursue a weaker stability requirement called pairwise stability, where no pair of a college and single student has an incentive to deviate. Unfortunately, a pairwise stable matching is not guaranteed to exist either. Thus, we consider a restricted market called a typed weighted market, in which students are partitioned into types that induce their possible wages. We then design a strategy-proof and Pareto efficient mechanism that works in polynomial-time for computing a pairwise stable matching in typed weighted markets.
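
Pairwise stability is straightforward to check for a given matching, in contrast to the coalitional notion. Below is a brute-force checker for a simplified model (fixed per-contract wages, unit demand for students, colleges valuing students individually); the paper's contract model is richer, so treat this purely as an illustration:

```python
def is_pairwise_stable(matching, wage, value, budget):
    """Check the weaker notion used above: no single (college, student)
    pair can profitably deviate. `matching` maps student -> college or
    None; wages are fixed per (student, college) contract. A pair blocks
    if the student would earn more at the college and the college can
    afford her, possibly by dropping one hire it values less."""
    students, colleges = wage.keys(), budget.keys()
    spent = {c: sum(wage[s][c] for s in students if matching[s] == c)
             for c in colleges}
    for s in students:
        cur = 0 if matching[s] is None else wage[s][matching[s]]
        for c in colleges:
            if wage[s][c] <= cur:
                continue                      # student has no incentive
            free = budget[c] - spent[c]
            hires = [t for t in students if matching[t] == c and t != s]
            if wage[s][c] <= free or any(
                    value[c][t] < value[c][s] and
                    wage[s][c] <= free + wage[t][c] for t in hires):
                return False, (s, c)          # blocking pair found
    return True, None

# Toy market with two students and two colleges.
wage = {"s1": {"c1": 5, "c2": 3}, "s2": {"c1": 4, "c2": 4}}
value = {"c1": {"s1": 2, "s2": 1}, "c2": {"s1": 1, "s2": 2}}
budget = {"c1": 5, "c2": 4}
print(is_pairwise_stable({"s1": "c1", "s2": "c2"}, wage, value, budget))
```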

Journal ArticleDOI
TL;DR: An experimental survey of state-of-the-art latent factor models is conducted, not towards a purely comparative end, but as a means to get insight about their inductive abilities.
Abstract: Latent factor models are increasingly popular for modeling multi-relational knowledge graphs. By their vectorial nature, it is not only hard to interpret why this class of models works so well, but also to understand where they fail and how they might be improved. We conduct an experimental survey of state-of-the-art models, not towards a purely comparative end, but as a means to get insight about their inductive abilities. To assess the strengths and weaknesses of each model, we create simple tasks that exhibit first, atomic properties of binary relations, and then, common inter-relational inference through synthetic genealogies. Based on these experimental results, we propose new research directions to improve on existing models.
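
As an example of the "atomic properties of binary relations" such probes target, DistMult scores a triple with a trilinear product that is symmetric in head and tail, so it cannot represent asymmetric relations. A tiny sketch with untrained random embeddings (DistMult is one of the model families such surveys cover; the specific setup here is made up):

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n_ent, n_rel = 16, 5, 2
E = rng.normal(size=(n_ent, dim))            # entity embeddings
R = rng.normal(size=(n_rel, dim))            # relation embeddings

def distmult(h, r, t):
    """DistMult score <e_h, w_r, e_t>: a trilinear product. Symmetric in
    h and t by construction, so it can never model asymmetric relations --
    exactly the kind of atomic property of binary relations such probes
    reveal."""
    return float(np.sum(E[h] * R[r] * E[t]))

print(distmult(0, 1, 2), distmult(2, 1, 0))  # identical: symmetry by design
```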

Journal ArticleDOI
TL;DR: This paper shows how the existence of a manifold (positive average pairwise task correlation) can be analysed in AI, and how this relates to the notion of agent generality, from the individual and the populational points of view.
Acknowledgments: I thank the anonymous reviewers of ECAI'2016 for their comments on an early version of the experiments shown in Section 4. I'm really grateful to Philip J. Bontrager, Ahmed Khalifa, Diego Perez-Liebana and Julian Togelius for providing me with the GVGAI competition data that made Section 3 possible. David Stillwell and Aiden Loe suggested the use of person-fit as a measure of generality. The JAIR reviewers have provided very insightful and constructive comments, which have greatly helped to improve the final version of this paper. This work has been partially supported by the EU (FEDER) and Spanish MINECO grant TIN2015-69175-C4-1-R, and by Generalitat Valenciana PROMETEOII/2015/013 and PROMETEO/2019/098. I also thank the support from the Future of Life Institute through FLI grant RFP2-152. Part of this work has been done while visiting the Leverhulme Centre for the Future of Intelligence, generously funded by the Leverhulme Trust. I also thank the UPV for granting me a sabbatical leave and the funding from the Spanish MECD programme "Salvador de Madariaga" (PRX17/00467) and a BEST grant (BEST/2017/045) from the Generalitat Valenciana for another research stay also at the CFI.

Journal ArticleDOI
TL;DR: This account is based on a novel channel-based approach to Bayesian probability, and describes these two approaches as different ways of updating with soft evidence, highlighting their differences, similarities and applications.
Abstract: Evidence in probabilistic reasoning may be 'hard' or 'soft', that is, it may be of yes/no form, or it may involve a strength of belief, in the unit interval [0, 1]. Reasoning with soft, [0, 1]-valued evidence is important in many situations but may lead to different, confusing interpretations. This paper intends to bring more mathematical and conceptual clarity to the field by shifting the existing focus from specification of soft evidence to accommodation of soft evidence. There are two main approaches, known as Jeffrey's rule and Pearl's method; they give different outcomes on soft evidence. This paper argues that they can be understood as correction and as improvement. It describes these two approaches as different ways of updating with soft evidence, highlighting their differences, similarities and applications. This account is based on a novel channel-based approach to Bayesian probability. Proper understanding of these two update mechanisms is highly relevant for inference, decision tools and probabilistic programming languages.
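
The disagreement between the two rules shows up already with a single binary query variable and one soft-evidence event. A self-contained numeric sketch (the joint distribution and evidence strength are arbitrary toy numbers):

```python
def normalise(d):
    z = sum(d.values())
    return {k: v / z for k, v in d.items()}

# Joint distribution over (x, e): x is the query, e the soft-evidence event.
joint = {("x0", "e"): 0.30, ("x0", "~e"): 0.30,
         ("x1", "e"): 0.30, ("x1", "~e"): 0.10}
q = 0.8                                      # new strength of belief in e

def jeffrey(joint, q):
    """Jeffrey's rule ('correction'): force P'(e) = q and mix the
    conditionals P(x|e) and P(x|~e) accordingly."""
    pe = sum(v for (x, ev), v in joint.items() if ev == "e")
    return normalise({x: q * joint[(x, "e")] / pe
                      + (1 - q) * joint[(x, "~e")] / (1 - pe)
                      for x in ("x0", "x1")})

def pearl(joint, q):
    """Pearl's method ('improvement'): treat q as a virtual-evidence
    likelihood with ratio q : 1-q and condition on it."""
    return normalise({x: q * joint[(x, "e")] + (1 - q) * joint[(x, "~e")]
                      for x in ("x0", "x1")})

print("Jeffrey:", jeffrey(joint, q))
print("Pearl:  ", pearl(joint, q))           # different posteriors on x
```

Jeffrey's rule forces the posterior probability of e to equal q exactly, while Pearl's method merely tilts the prior by a q : (1-q) likelihood, which is why the two posteriors on x differ.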

Journal ArticleDOI
TL;DR: This article provides a combined operational semantics for goals and commitments by relating their respective life cycles as a basis for how these concepts cohere for an individual agent and engender cooperation among agents.
Abstract: Commitments capture how an agent relates to another agent, whereas goals describe states of the world that an agent is motivated to bring about. Commitments are elements of the social state of a set of agents whereas goals are elements of the private states of individual agents. It makes intuitive sense that goals and commitments are understood as being complementary to each other. More importantly, an agent’s goals and commitments ought to be coherent, in the sense that an agent’s goals would lead it to adopt or modify relevant commitments and an agent’s commitments would lead it to adopt or modify relevant goals. However, despite the intuitive naturalness of the above connections, they have not been adequately studied in a formal framework. This article provides a combined operational semantics for goals and commitments by relating their respective life cycles as a basis for how these concepts (1) cohere for an individual agent and (2) engender cooperation among agents. Our semantics yields important desirable properties of convergence of the configurations of cooperating agents, thereby delineating some theoretically well-founded yet practical modes of cooperation in a multiagent system.

Journal ArticleDOI
TL;DR: In this article, the authors proposed an iterative local voting (ILV) algorithm for collective decision-making in high-dimensional continuous spaces, where voters are sequentially sampled and asked to modify a candidate solution within some local neighborhood of its current value, defined by a ball in some chosen norm, with the size of the ball shrinking at a specified rate.
Abstract: Many societal decision problems lie in high-dimensional continuous spaces not amenable to the voting techniques common for their discrete or single-dimensional counterparts. These problems are typically discretized before running an election or decided upon through negotiation by representatives. We propose an algorithm called Iterative Local Voting for collective decision-making in this setting. In this algorithm, voters are sequentially sampled and asked to modify a candidate solution within some local neighborhood of its current value, as defined by a ball in some chosen norm, with the size of the ball shrinking at a specified rate. We first prove the convergence of this algorithm under appropriate choices of neighborhoods to Pareto optimal solutions with desirable fairness properties in certain natural settings: when the voters' utilities can be expressed in terms of some form of distance from their ideal solution, and when these utilities are additively decomposable across dimensions. In many of these cases, we obtain convergence to the societal welfare maximizing solution. We then describe an experiment in which we test our algorithm for the decision of the U.S. Federal Budget on Mechanical Turk with over 2,000 workers, employing neighborhoods defined by $\mathcal{L}^1, \mathcal{L}^2$ and $\mathcal{L}^\infty$ balls. We make several observations that inform future implementations of such a procedure.
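
A minimal simulation of the procedure as described: sample one voter per round, let her pull the candidate solution toward her ideal point within an $\mathcal{L}^2$ ball whose radius shrinks at a 1/t rate. The voter model and constants are toy assumptions; for quadratic (squared-L2) utilities the mean ideal point is the welfare-maximising target:

```python
import numpy as np

def iterative_local_voting(ideal_points, x0, radius0=2.0, steps=3000, seed=0):
    """Sketch of the procedure described above: one sampled voter per
    round moves the candidate solution to her favourite point inside an
    L2 ball around its current value, with the ball shrinking at a 1/t
    rate. With quadratic utilities this behaves like stochastic gradient
    descent toward the mean ideal point."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for t in range(1, steps + 1):
        voter = ideal_points[rng.integers(len(ideal_points))]
        step = voter - x                      # voter pulls x toward her ideal
        r = radius0 / t                       # shrinking neighbourhood
        if np.linalg.norm(step) > r:
            step *= r / np.linalg.norm(step)  # clip the move to the ball
        x += step
    return x

ideals = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [3.0, 3.0]])
print(iterative_local_voting(ideals, x0=[5.0, 5.0]))
print("mean ideal:", ideals.mean(axis=0))    # welfare optimum for L2^2 utilities
```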

Journal ArticleDOI
TL;DR: The CFR+ algorithm for solving imperfect information games is a variant of the popular CFR algorithm, with faster empirical performance on a range of problems as discussed by the authors, and it was introduced with a theoretical upper bound on solution error, but subsequent work showed an error in one step of the proof.
Abstract: The CFR+ algorithm for solving imperfect information games is a variant of the popular CFR algorithm, with faster empirical performance on a range of problems. It was introduced with a theoretical upper bound on solution error, but subsequent work showed an error in one step of the proof. We provide updated proofs to recover the original bound.
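
The modification at the heart of CFR+ is regret-matching+: cumulative regrets are clipped at zero after every update, and average strategies are weighted linearly. A self-contained sketch on a single zero-sum matrix game (full CFR+ applies this at every information set of an extensive-form game; this toy omits that machinery):

```python
import numpy as np

def regret_matching_plus(payoff, iters=2000):
    """Self-play with regret-matching+, the update at the heart of CFR+:
    cumulative regrets are clipped at zero after every step -- the
    modification whose error bound the paper's corrected proofs recover."""
    n = payoff.shape[0]
    regrets = [np.zeros(n), np.zeros(n)]
    avg = [np.zeros(n), np.zeros(n)]
    for t in range(1, iters + 1):
        strats = []
        for p in (0, 1):
            pos = np.maximum(regrets[p], 0.0)
            strats.append(pos / pos.sum() if pos.sum() > 0 else np.full(n, 1 / n))
        u0 = payoff @ strats[1]              # player 0's action values
        u1 = -payoff.T @ strats[0]           # zero-sum: player 1 gets -payoff
        for p, u, s in ((0, u0, strats[0]), (1, u1, strats[1])):
            regrets[p] = np.maximum(regrets[p] + u - s @ u, 0.0)  # the "+"
            avg[p] += t * s                  # CFR+ also averages linearly
    return [a / a.sum() for a in avg]

rps = np.array([[0, -1, 1], [1, 0, -1], [-1, 1, 0]])  # rock-paper-scissors
print([np.round(s, 3) for s in regret_matching_plus(rps)])
```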

Journal ArticleDOI
TL;DR: This work introduces multi-labelling systems, a generic formalism devoted to representing reasoning processes consisting of a sequence of labelling stages, and shows how they can be seamlessly integrated into different formalisms.
Abstract: In computational models of argumentation, the justification of statements has drawn less attention than the construction and justification of arguments. As a consequence, significant losses of sensitivity and expressiveness in the treatment of statement statuses can be incurred by otherwise appealing formalisms. In order to reappraise statement statuses and, more generally, to support a uniform modelling of different phases of the argumentation process, we introduce multi-labelling systems, a generic formalism devoted to representing reasoning processes consisting of a sequence of labelling stages. In this context, two families of multi-labelling systems, called the argument-focused and statement-focused approach, are identified and compared. Then they are shown to be able to encompass several prominent literature proposals as special cases, thereby enabling a systematic comparison evidencing their merits and limits. Further, we show that the proposed model supports tunability of statement justification by specifying a few alternative statement justification labellings, and we illustrate how they can be seamlessly integrated into different formalisms.

Journal ArticleDOI
TL;DR: The worst case distinctiveness (WCD) measure is suggested, which represents the maximal cost of a path an agent may follow before its goal can be inferred by a goal recognition system.
Abstract: Goal recognition design (GRD) facilitates understanding the goals of acting agents through the analysis and redesign of goal recognition models, thus offering a solution for assessing and minimizing the maximal progress of any agent in the model before goal recognition is guaranteed. In a nutshell, given a model of a domain and a set of possible goals, a solution to a GRD problem determines (1) the extent to which actions performed by an agent within the model reveal the agent’s objective; and (2) how best to modify the model so that the objective of an agent can be detected as early as possible. This approach is relevant to any domain in which rapid goal recognition is essential and the model design can be controlled. Applications include intrusion detection, assisted cognition, computer games, and human-robot collaboration. A GRD problem has two components: the analyzed goal recognition setting, and a design model specifying the possible ways the environment in which agents act can be modified so as to facilitate recognition. This work formulates a general framework for GRD in deterministic and partially observable environments, and offers a toolbox of solutions for evaluating and optimizing model quality for various settings. For the purpose of evaluation we suggest the worst case distinctiveness (WCD) measure, which represents the maximal cost of a path an agent may follow before its goal can be inferred by a goal recognition system. We offer novel compilations to classical planning for calculating WCD in settings where agents are bounded-suboptimal. We then suggest methods for minimizing WCD by searching for an optimal redesign strategy within the space of possible modifications, and using pruning to increase efficiency. We support our approach with an empirical evaluation that measures WCD in a variety of GRD settings and tests the efficiency of our compilation-based methods for computing it. We also examine the effectiveness of reducing WCD via redesign and the performance gain brought about by our proposed pruning strategy.

Journal ArticleDOI
TL;DR: A stochastic optimization model is proposed to find an optimal Partial-order Schedule (POS) that minimizes the expected makespan; the model covers both the time-dependent uncertainty studied in this paper and the traditional time-independent duration uncertainty.
Abstract: In real-world project scheduling applications, activity durations are often uncertain. Proactive scheduling can effectively cope with the duration uncertainties, by generating robust baseline solutions according to a priori stochastic knowledge. However, most of the existing proactive approaches assume that the duration uncertainty of an activity is not related to its scheduled start time, which may not hold in many real-world scenarios. In this paper, we relax this assumption by allowing the duration uncertainty to be time-dependent, which is caused by the uncertainty of whether the activity can be executed on each time slot. We propose a stochastic optimization model to find an optimal Partial-order Schedule (POS) that minimizes the expected makespan. This model can cover both the time-dependent uncertainty studied in this paper and the traditional time-independent duration uncertainty. To circumvent the underlying complexity in evaluating a given solution, we approximate the stochastic optimization model based on Sample Average Approximation (SAA). Finally, we design two efficient branch-and-bound algorithms to solve the NP-hard SAA problem. Empirical evaluation confirms that our approach can generate high-quality proactive solutions for a variety of uncertainty distributions.
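
The SAA step is simple to illustrate: replace the expected makespan of a partial-order schedule with an average over sampled duration scenarios. The toy project network and triangular duration samplers below are assumptions for illustration; the paper additionally optimizes over schedules with branch-and-bound and handles time-dependent uncertainty:

```python
import random

# A toy project: activity -> (predecessors, duration sampler). The sampler
# stands in for the stochastic, possibly time-dependent duration model.
project = {
    "a": ([], lambda: random.triangular(2, 6, 3)),
    "b": (["a"], lambda: random.triangular(1, 8, 2)),
    "c": (["a"], lambda: random.triangular(3, 5, 4)),
    "d": (["b", "c"], lambda: random.triangular(1, 4, 2)),
}

def makespan(durations):
    """Longest path through the precedence (partial-order) network."""
    finish = {}
    for act in ("a", "b", "c", "d"):          # topological order
        preds, _ = project[act]
        finish[act] = max((finish[p] for p in preds), default=0.0) + durations[act]
    return max(finish.values())

def saa_expected_makespan(samples=10000, seed=7):
    """Sample Average Approximation: replace the intractable expectation
    with an average over sampled duration scenarios."""
    random.seed(seed)
    total = 0.0
    for _ in range(samples):
        total += makespan({a: sampler() for a, (_, sampler) in project.items()})
    return total / samples

print(round(saa_expected_makespan(), 2))
```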

Journal ArticleDOI
TL;DR: A novel abstractive model is proposed which is conditioned on the article's topics and based entirely on convolutional neural networks, outperforming an oracle extractive system and state-of-the-art abstractive approaches when evaluated automatically and by humans on the extreme summarization dataset.
Abstract: We introduce 'extreme summarization', a new single-document summarization task which aims at creating a short, one-sentence news summary answering the question “What is the article about?”. We argue that extreme summarization, by nature, is not amenable to extractive strategies and requires an abstractive modeling approach. In the hope of driving research on this task further: (a) we collect a real-world, large scale dataset by harvesting online articles from the British Broadcasting Corporation (BBC); and (b) propose a novel abstractive model which is conditioned on the article's topics and based entirely on convolutional neural networks. We demonstrate experimentally that this architecture captures long-range dependencies in a document and recognizes pertinent content, outperforming an oracle extractive system and state-of-the-art abstractive approaches when evaluated automatically and by humans on the extreme summarization dataset.