
Showing papers in "Journal of Artificial Intelligence Research in 2020"


Journal ArticleDOI
TL;DR: In this paper, the authors present an overview of recent studies using Machine Learning and Artificial Intelligence to tackle many aspects of the COVID-19 crisis and highlight the need for international cooperation to maximize the potential of AI in this and future pandemics.
Abstract: COVID-19, the disease caused by the SARS-CoV-2 virus, has been declared a pandemic by the World Health Organization, which has reported over 18 million confirmed cases as of August 5, 2020. In this review, we present an overview of recent studies using Machine Learning and, more broadly, Artificial Intelligence, to tackle many aspects of the COVID-19 crisis. We have identified applications that address challenges posed by COVID-19 at different scales, including: molecular, by identifying new or existing drugs for treatment; clinical, by supporting diagnosis and evaluating prognosis based on medical imaging and non-invasive measures; and societal, by tracking both the epidemic and the accompanying infodemic using multiple data sources. We also review datasets, tools, and resources needed to facilitate Artificial Intelligence research, and discuss strategic considerations related to the operational implementation of multidisciplinary partnerships and open science. We highlight the need for international cooperation to maximize the potential of AI in this and future pandemics.

315 citations


Journal ArticleDOI
TL;DR: A set of tests is presented that bridges the vast amount of linguistic and philosophical theory about the compositionality of language and the successful neural models of language; the tests uncover the strengths and weaknesses of three popular sequence-to-sequence architectures and point to potential areas of improvement.
Abstract: Despite a multitude of empirical studies, little consensus exists on whether neural networks are able to generalise compositionally, a controversy that, in part, stems from a lack of agreement about what it means for a neural model to be compositional. As a response to this controversy, we present a set of tests that provide a bridge between, on the one hand, the vast amount of linguistic and philosophical theory about compositionality of language and, on the other, the successful neural models of language. We collect different interpretations of compositionality and translate them into five theoretically grounded tests for models that are formulated on a task-independent level. In particular, we provide tests to investigate (i) if models systematically recombine known parts and rules, (ii) if models can extend their predictions beyond the length they have seen in the training data, (iii) if models' composition operations are local or global, (iv) if models' predictions are robust to synonym substitutions, and (v) if models favour rules or exceptions during training. To demonstrate the usefulness of this evaluation paradigm, we instantiate these five tests on a highly compositional data set which we dub PCFG SET and apply the resulting tests to three popular sequence-to-sequence models: a recurrent, a convolution-based and a transformer model. We provide an in-depth analysis of the results, which uncover the strengths and weaknesses of these three architectures and point to potential areas of improvement.

95 citations


Journal ArticleDOI
TL;DR: The authors trace back the origins of modern NMT architectures to word and sentence embeddings and earlier examples of the encoder-decoder network family and conclude with a survey of recent trends in the field.
Abstract: The field of machine translation (MT), the automatic translation of written text from one natural language into another, has experienced a major paradigm shift in recent years. Statistical MT, which mainly relies on various count-based models and which used to dominate MT research for decades, has largely been superseded by neural machine translation (NMT), which tackles translation with a single neural network. In this work we will trace back the origins of modern NMT architectures to word and sentence embeddings and earlier examples of the encoder-decoder network family. We will conclude with a survey of recent trends in the field.

76 citations


Journal ArticleDOI
TL;DR: This paper employs state-of-the-art methods for training deep neural networks to devise a novel model that is able to learn how to effectively perform logical reasoning in the form of basic ontology reasoning, and shows that the model learned to perform precise ontology reasoning on diverse and challenging tasks.
Abstract: The ability to conduct logical reasoning is a fundamental aspect of intelligent human behavior, and thus an important problem along the way to human-level artificial intelligence. Traditionally, logic-based symbolic methods from the field of knowledge representation and reasoning have been used to equip agents with capabilities that resemble human logical reasoning qualities. More recently, however, there has been an increasing interest in using machine learning rather than logic-based symbolic formalisms to tackle these tasks. In this paper, we employ state-of-the-art methods for training deep neural networks to devise a novel model that is able to learn how to effectively perform logical reasoning in the form of basic ontology reasoning. This is an important and at the same time very natural logical reasoning task, which is why the presented approach is applicable to a plethora of important real-world problems. We present the outcomes of several experiments, which show that our model is able to learn to perform highly accurate ontology reasoning on very large, diverse, and challenging benchmarks. Furthermore, it turned out that the suggested approach suffers much less from different obstacles that prohibit logic-based symbolic reasoning, and, at the same time, is surprisingly plausible from a biological point of view.

57 citations


Journal ArticleDOI
TL;DR: Methods for using human-robot dialog to improve language understanding for a mobile robot agent that parses natural language to underlying semantic meanings and uses robotic sensors to create multi-modal models of perceptual concepts like red and heavy are presented.
Abstract: This Journal of Artificial Intelligence Research article from UT-affiliated researchers presents methods for using human-robot dialog to improve language understanding for a mobile robot agent that parses natural language into underlying semantic meanings and uses robotic sensors to create multi-modal models of perceptual concepts such as 'red' and 'heavy'.

45 citations


Journal ArticleDOI
TL;DR: It is shown that the professional human translations contained significantly fewer errors, and that perceived quality in human evaluation depends on the choice of raters, the availability of linguistic context, and the creation of reference translations.
Abstract: The quality of machine translation has increased remarkably over the past years, to the degree that it was found to be indistinguishable from professional human translation in a number of empirical investigations. We reassess Hassan et al.’s 2018 investigation into Chinese to English news translation, showing that the finding of human–machine parity was owed to weaknesses in the evaluation design—which is currently considered best practice in the field. We show that the professional human translations contained significantly fewer errors, and that perceived quality in human evaluation depends on the choice of raters, the availability of linguistic context, and the creation of reference translations. Our results call for revisiting current best practices to assess strong machine translation systems in general and human–machine parity in particular, for which we offer a set of recommendations based on our empirical findings.

38 citations


Journal ArticleDOI
TL;DR: Adaptive stress testing (AST), as discussed by the authors, is a simulation-based framework for finding the most likely path to a failure event, making it suitable for black-box testing of large systems.
Abstract: Finding the most likely path to a set of failure states is important to the analysis of safety-critical systems that operate over a sequence of time steps, such as aircraft collision avoidance systems and autonomous cars. In many applications such as autonomous driving, failures cannot be completely eliminated due to the complex stochastic environment in which the system operates. As a result, safety validation is not only concerned about whether a failure can occur, but also discovering which failures are most likely to occur. This article presents adaptive stress testing (AST), a framework for finding the most likely path to a failure event in simulation. We consider a general black box setting for partially observable and continuous-valued systems operating in an environment with stochastic disturbances. We formulate the problem as a Markov decision process and use reinforcement learning to optimize it. The approach is simulation-based and does not require internal knowledge of the system, making it suitable for black-box testing of large systems. We present formulations for fully observable and partially observable systems. In the latter case, we present a modified Monte Carlo tree search algorithm that only requires access to the pseudorandom number generator of the simulator to overcome partial observability. We also present an extension of the framework, called differential adaptive stress testing (DAST), that can find failures that occur in one system but not in another. This type of differential analysis is useful in applications such as regression testing, where we are concerned with finding areas of relative weakness compared to a baseline. We demonstrate the effectiveness of the approach on an aircraft collision avoidance application, where a prototype aircraft collision avoidance system is stress tested to find the most likely scenarios of near mid-air collision.

36 citations
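
To make the AST formulation concrete, here is a minimal, hypothetical sketch (the toy simulator, dynamics, and reward constants below are illustrative assumptions, not from the paper): the objective rewards disturbance sequences that are both likely under the disturbance model and lead to failure, and a simple hill-climbing search stands in for the paper's reinforcement-learning and Monte Carlo tree search solvers.

```python
import numpy as np

# Hypothetical toy system: a scalar state that "fails" if it drifts past a threshold.
def simulate(disturbances, fail_threshold=5.0):
    x, log_prob = 0.0, 0.0
    for d in disturbances:
        log_prob += -0.5 * d ** 2 - 0.5 * np.log(2 * np.pi)  # log-likelihood of disturbance
        x += 0.1 * x + 0.5 * d                               # toy dynamics
        if abs(x) > fail_threshold:
            return log_prob, True, 0.0
    return log_prob, False, fail_threshold - abs(x)          # distance still left to failure

def ast_objective(disturbances, failure_bonus=1e3, miss_penalty=100.0):
    # AST-style reward: prefer failure events; among them, prefer likely disturbance paths.
    log_prob, failed, miss = simulate(disturbances)
    return log_prob + (failure_bonus if failed else -miss_penalty * miss)

# Hill climbing over disturbance sequences (stand-in for the paper's RL/MCTS search).
rng = np.random.default_rng(0)
best = rng.normal(size=50)
best_score = ast_objective(best)
for _ in range(5000):
    candidate = best + rng.normal(scale=0.2, size=50)
    score = ast_objective(candidate)
    if score > best_score:
        best, best_score = candidate, score

print("failure found:", simulate(best)[1], "objective:", round(best_score, 2))
```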


Journal ArticleDOI
TL;DR: An implementation of a probabilistic first-order logic called TensorLog, in which classes of logical queries are compiled into differentiable functions in a neural-network infrastructure such as TensorFlow or Theano, enabling high-performance deep learning frameworks to be used for tuning the parameters of a probabilistic logic.
Abstract: We present an implementation of a probabilistic first-order logic called TensorLog, in which classes of logical queries are compiled into differentiable functions in a neural-network infrastructure such as TensorFlow or Theano. This leads to a close integration of probabilistic logical reasoning with deep-learning infrastructure: in particular, it enables high-performance deep learning frameworks to be used for tuning the parameters of a probabilistic logic. The integration with these frameworks enables use of GPU-based parallel processors for inference and learning, making TensorLog the first highly parallelizable probabilistic logic. Experimental results show that TensorLog scales to problems involving hundreds of thousands of knowledge-base triples and tens of thousands of examples.

36 citations
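
The compilation idea can be illustrated with a minimal sketch of the core mechanism, not the actual TensorLog API: each binary relation is stored as a weighted adjacency matrix over entities, and a chain rule such as grandparent(X,Y) :- parent(X,Z), parent(Z,Y) compiles into matrix-vector products applied to a one-hot encoding of the query constant, so the fact weights are ordinary differentiable parameters.

```python
import numpy as np

entities = ["alice", "bob", "carol", "dave"]
idx = {e: i for i, e in enumerate(entities)}

def relation(facts, n=len(entities)):
    """Encode a binary relation as a matrix M[h, t] = confidence of the fact r(h, t)."""
    M = np.zeros((n, n))
    for head, tail, weight in facts:
        M[idx[head], idx[tail]] = weight
    return M

def one_hot(e, n=len(entities)):
    v = np.zeros(n)
    v[idx[e]] = 1.0
    return v

# Toy knowledge base with soft facts: parent(alice, bob), parent(bob, carol).
parent = relation([("alice", "bob", 1.0), ("bob", "carol", 0.9)])

# Rule grandparent(X, Y) :- parent(X, Z), parent(Z, Y), compiled for the query
# grandparent(alice, ?): a chain of (differentiable) matrix-vector products.
scores = one_hot("alice") @ parent @ parent
print({e: float(s) for e, s in zip(entities, scores) if s > 0})  # {'carol': 0.9}
```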


Journal ArticleDOI
TL;DR: This work introduces the notion of a k-robust MAPF plan, which is a plan that can be executed even if a limited number of delays occur, and proposes several robust execution policies.
Abstract: Multi-agent path-finding (MAPF) is the problem of finding a plan for moving a set of agents from their initial locations to their goals without collisions. Following this plan, however, may not be possible due to unexpected events that delay some of the agents. In this work, we propose a holistic solution for MAPF that is robust to such unexpected delays. First, we introduce the notion of a k-robust MAPF plan, which is a plan that can be executed even if a limited number (k) of delays occur. We propose sufficient and required conditions for finding a k-robust plan, and show how to convert several MAPF solvers to find such plans. Then, we propose several robust execution policies. An execution policy is a policy for agents executing a MAPF plan. An execution policy is robust if following it guarantees that the agents reach their goals even if they encounter unexpected delays. Several classes of such robust execution policies are proposed and evaluated experimentally. Finally, we present robust execution policies for cases where communication between the agents may also be delayed. We performed an extensive experimental evaluation in which we compared different algorithms for finding robust MAPF plans, compared different robust execution policies, and studied the interplay between having a robust plan and the performance when using a robust execution policy.

32 citations
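
A rough way to read the k-robustness requirement is as a constraint on how closely agents may reuse vertices; the hypothetical checker below rejects a plan whenever two agents occupy the same vertex within k timesteps of each other (a simplified stand-in for the paper's formal conditions, shown only to illustrate the idea).

```python
from itertools import combinations

def is_k_robust(paths, k):
    """Check a MAPF plan (agent name -> list of vertices, one per timestep) for
    near-collisions: two agents using the same vertex within k timesteps."""
    for a, b in combinations(paths, 2):
        for t1, v1 in enumerate(paths[a]):
            for t2, v2 in enumerate(paths[b]):
                if v1 == v2 and abs(t1 - t2) <= k:
                    return False
    return True

plan = {"agent1": ["A", "B", "C"], "agent2": ["D", "E", "B"]}
print(is_k_robust(plan, k=0))  # True: no two agents share a vertex at the same time
print(is_k_robust(plan, k=1))  # False: agent2 enters B one step after agent1 leaves it
```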


Journal ArticleDOI
TL;DR: This study surveys the use of Autoencoders for the imputation of tabular data, considering 26 works published between 2014 and 2020, and shows that Denoising Autoencoders outperform their competitors, particularly the often-used statistical methods.
Abstract: Missing data is a problem often found in real-world datasets and it can degrade the performance of most machine learning models. Several deep learning techniques have been used to address this issue, and one of them is the Autoencoder and its Denoising and Variational variants. These models are able to learn a representation of the data with missing values and generate plausible new ones to replace them. This study surveys the use of Autoencoders for the imputation of tabular data and considers 26 works published between 2014 and 2020. The analysis is mainly focused on discussing patterns and recommendations for the architecture, hyperparameters and training settings of the network, while providing a detailed discussion of the results obtained by Autoencoders when compared to other state-of-the-art methods, and of the data contexts where they have been applied. The conclusions include a set of recommendations for the technical settings of the network, and show that Denoising Autoencoders outperform their competitors, particularly the often used statistical methods.

30 citations
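
A minimal denoising-autoencoder imputation sketch in the spirit of the surveyed approaches (the architecture, layer sizes, corruption rate, and zero-fill masking scheme are illustrative assumptions rather than recommendations from the survey):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic tabular data with 20% of the entries missing.
X = torch.randn(500, 8)
missing = torch.rand_like(X) < 0.2
X_obs = X.clone()
X_obs[missing] = 0.0                                  # simple zero-fill before training

model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4),
                      nn.ReLU(), nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 8))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for epoch in range(200):
    corrupt = torch.rand_like(X) < 0.1                # extra corruption = "denoising"
    noisy_input = torch.where(corrupt, torch.zeros_like(X_obs), X_obs)
    recon = model(noisy_input)
    loss = ((recon - X_obs) ** 2)[~missing].mean()    # supervise only observed cells
    opt.zero_grad(); loss.backward(); opt.step()

# Impute: keep observed values, fill missing cells with the reconstruction.
with torch.no_grad():
    X_imputed = torch.where(missing, model(X_obs), X_obs)
print("RMSE on missing cells:", ((X_imputed - X)[missing] ** 2).mean().sqrt().item())
```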


Journal ArticleDOI
TL;DR: This work proves that Proportional Approval Voting can be computed in polynomial time for profiles that are single-peaked on a circle, gives a fast recognition algorithm for this domain, provides a characterisation by finitely many forbidden subprofiles, and shows that many popular single- and multi-winner voting rules are polynomial-time computable on this domain.
Abstract: We introduce the domain of preferences that are single-peaked on a circle, which is a generalization of the well-studied single-peaked domain. This preference restriction is useful, e.g., for scheduling decisions, certain facility location problems, and for one-dimensional decisions in the presence of extremist preferences. We give a fast recognition algorithm of this domain, provide a characterisation by finitely many forbidden subprofiles, and show that many popular single- and multi-winner voting rules are polynomial-time computable on this domain. In particular, we prove that Proportional Approval Voting can be computed in polynomial time for profiles that are single-peaked on a circle. In contrast, Kemeny’s rule remains hard to evaluate, and several impossibility results from social choice theory can be proved using only profiles in this domain.

Journal ArticleDOI
TL;DR: This paper proposes an algorithm, namely Sliding-Window Thompson Sampling (SW-TS), and empirically shows that SW-TS dramatically outperforms state-of-the-art algorithms even when the forms of non-stationarity are taken separately, as previously studied in the literature.
Abstract: Multi-Armed Bandit (MAB) techniques have been successfully applied to many classes of sequential decision problems in the past decades. However, non-stationary settings, which are very common in real-world applications, have received little attention so far, and theoretical guarantees on the regret are known only for some frequentist algorithms. In this paper, we propose an algorithm, namely Sliding-Window Thompson Sampling (SW-TS), for non-stationary stochastic MAB settings. Our algorithm is based on Thompson Sampling and exploits a sliding-window approach to tackle, in a unified fashion, two different forms of non-stationarity studied separately so far: abruptly changing and smoothly changing. In the former, the reward distributions are constant during sequences of rounds, and their change may be arbitrary and happen at unknown rounds, while, in the latter, the reward distributions smoothly evolve over rounds according to unknown dynamics. Under mild assumptions, we provide upper bounds on the dynamic pseudo-regret of SW-TS for the abruptly changing environment, for the smoothly changing one, and for the setting in which both forms of non-stationarity are present. Furthermore, we empirically show that SW-TS dramatically outperforms state-of-the-art algorithms even when the forms of non-stationarity are taken separately, as previously studied in the literature.
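
A minimal sliding-window Thompson Sampling sketch for Bernoulli rewards (the window size, Beta(1,1) priors, and the abruptly changing toy environment are illustrative assumptions; the paper treats the general setting and its regret analysis):

```python
import numpy as np
from collections import deque

rng = np.random.default_rng(0)
n_arms, horizon, window = 3, 5000, 500
history = deque(maxlen=window)            # only the last `window` (arm, reward) pairs count

def true_means(t):                        # abruptly changing environment (illustrative)
    return [0.3, 0.5, 0.7] if t < horizon // 2 else [0.8, 0.5, 0.2]

total = 0.0
for t in range(horizon):
    # Posterior built from the sliding window only, starting from a Beta(1, 1) prior.
    successes, failures = np.ones(n_arms), np.ones(n_arms)
    for arm, reward in history:
        successes[arm] += reward
        failures[arm] += 1 - reward
    sampled_means = rng.beta(successes, failures)   # Thompson Sampling draw per arm
    arm = int(np.argmax(sampled_means))
    reward = float(rng.random() < true_means(t)[arm])
    history.append((arm, reward))
    total += reward

print("average reward:", total / horizon)
```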

Journal ArticleDOI
TL;DR: It is shown that, next to the risks of setbacks and being reprimanded for unsafe behaviour, the time-scale in which domain supremacy can be achieved plays a crucial role, and that imposing regulations for all risk and timing conditions may not have the anticipated effect.
Abstract: Rapid technological advancements in Artificial Intelligence (AI), as well as the growing deployment of intelligent technologies in new application domains, have generated serious anxiety and a fear of missing out among different stakeholders, fostering a racing narrative. Whether real or not, the belief in such a race for domain supremacy through AI can make it real simply from its consequences, as put forward by the Thomas theorem. These consequences may be negative, as racing for technological supremacy creates a complex ecology of choices that could push stakeholders to underestimate or even ignore ethical and safety procedures. As a consequence, different actors are urging consideration of both the normative and social impact of these technological advancements, contemplating the use of the precautionary principle in AI innovation and research. Yet, given the breadth and depth of AI and its advances, it is difficult to assess which technology needs regulation and when. As there is no easy access to data describing this alleged AI race, theoretical models are necessary to understand its potential dynamics, allowing for the identification of when procedures need to be put in place to favour outcomes beneficial for all. We show that, next to the risks of setbacks and being reprimanded for unsafe behaviour, the time-scale in which domain supremacy can be achieved plays a crucial role. When this can be achieved in the short term, those who completely ignore the safety precautions are bound to win the race, but at a cost to society, apparently requiring regulatory actions. Our analysis reveals that imposing regulations for all risk and timing conditions may not have the anticipated effect, as a dilemma between what is individually preferred and globally beneficial arises only under specific conditions. Similar observations can be made for the long-term development case. Yet, different from the short-term situation, conditions can be identified that require the promotion of risk-taking, as opposed to compliance with safety regulations, in order to improve social welfare. These results remain robust both when two or several actors are involved in the race and when collective rather than individual setbacks are produced by risk-taking behaviour. When defining codes of conduct and regulatory policies for applications of AI, a clear understanding of the time-scale of the race is thus required, as this may induce important non-trivial effects.


Journal ArticleDOI
TL;DR: This article introduces two novel progression algorithms that avoid unnecessary branching when the problem at hand is partially ordered, shows that both are sound and complete, and introduces a method to apply arbitrary classical planning heuristics to guide the search in HTN planning.
Abstract: The majority of search-based HTN planning systems can be divided into those searching a space of partial plans (a plan space) and those performing progression search, i.e., that build the solution in a forward manner. So far, all HTN planners that guide the search by using heuristic functions are based on plan space search. Those systems represent the set of search nodes more effectively by maintaining a partial ordering between tasks, but they have only limited information about the current state during search. In this article, we propose the use of progression search as basis for heuristic HTN planning systems. Such systems can calculate their heuristics incorporating the current state, because it is tracked during search. Our contribution is the following: We introduce two novel progression algorithms that avoid unnecessary branching when the problem at hand is partially ordered and show that both are sound and complete. We show that defining systematicity is problematic for search in HTN planning, propose a definition, and show that it is fulfilled by one of our algorithms. Then, we introduce a method to apply arbitrary classical planning heuristics to guide the search in HTN planning. It relaxes the HTN planning model to a classical model that is only used for calculating heuristics. It is updated during search and used to create heuristic values that are used to guide the HTN search. We show that it can be used to create HTN heuristics with interesting theoretical properties like safety, goal-awareness, and admissibility. Our empirical evaluation shows that the resulting system outperforms the state of the art in search-based HTN planning.

Journal ArticleDOI
TL;DR: This work proposes a greedy algorithm to generate orders and shows how to use hill-climbing search to optimize a given order, which leads to significantly better heuristic estimates than using the best random order that is generated in the same time.
Abstract: Cost partitioning is a method for admissibly combining a set of admissible heuristic estimators by distributing operator costs among the heuristics. Computing an optimal cost partitioning, i.e., the operator cost distribution that maximizes the heuristic value, is often prohibitively expensive to compute. Saturated cost partitioning is an alternative that is much faster to compute and has been shown to yield high-quality heuristics. However, its greedy nature makes it highly susceptible to the order in which the heuristics are considered. We propose a greedy algorithm to generate orders and show how to use hill-climbing search to optimize a given order. Combining both techniques leads to significantly better heuristic estimates than using the best random order that is generated in the same time. Since there is often no single order that gives good guidance on the whole state space, we use the maximum of multiple orders as a heuristic that is significantly better informed than any single-order heuristic, especially when we actively search for a set of diverse orders.
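
The order-optimization step can be pictured as plain hill climbing over permutations of the heuristics; in the hypothetical sketch below, score(order) stands in for the quality of the saturated cost partitioning induced by an order (for example, summed heuristic values over sampled states), and the swap neighbourhood is an illustrative choice rather than the paper's exact procedure.

```python
import random

def hill_climb_order(heuristics, score, iters=1000, seed=0):
    """Improve an order of heuristics by swapping two positions and keeping improvements."""
    rng = random.Random(seed)
    order = list(heuristics)
    rng.shuffle(order)
    best = score(order)
    for _ in range(iters):
        i, j = rng.sample(range(len(order)), 2)
        order[i], order[j] = order[j], order[i]
        candidate = score(order)
        if candidate > best:
            best = candidate
        else:
            order[i], order[j] = order[j], order[i]   # undo a non-improving swap
    return order, best

# Toy usage: pretend an order is "better" the closer it is to being sorted.
toy_score = lambda order: -sum(abs(h - pos) for pos, h in enumerate(order))
print(hill_climb_order(range(10), toy_score))
```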

Journal ArticleDOI
TL;DR: This work investigates asking judges to provide a specific form of rationale supporting each rating decision; a cost-benefit analysis on an information retrieval task, in which human judges rate the relevance of Web pages for different search topics, suggests the approach is a win-win.
Abstract: When collecting item ratings from human judges, it can be difficult to measure and enforce data quality due to task subjectivity and lack of transparency into how judges make each rating decision. To address this, we investigate asking judges to provide a specific form of rationale supporting each rating decision. We evaluate this approach on an information retrieval task in which human judges rate the relevance of Web pages for different search topics. Cost-benefit analysis over 10,000 judgments collected on Amazon’s Mechanical Turk suggests a win-win. Firstly, rationales yield a multitude of benefits: more reliable judgments, greater transparency for evaluating both human raters and their judgments, reduced need for expert gold, the opportunity for dual-supervision from ratings and rationales, and added value from the rationales themselves. Secondly, once experienced in the task, crowd workers provide rationales with almost no increase in task completion time. Consequently, we can realize the above benefits with minimal additional cost.

Journal ArticleDOI
TL;DR: A general privacy-preserving framework for Variational Bayes (VB), a widely used optimization-based Bayesian inference method, that respects differential privacy, the gold-standard privacy criterion, and encompasses a large class of probabilistic models, called the Conjugate Exponential (CE) family.
Abstract: Many applications of Bayesian data analysis involve sensitive information, motivating methods which ensure that privacy is protected. We introduce a general privacy-preserving framework for Variational Bayes (VB), a widely used optimization-based Bayesian inference method. Our framework respects differential privacy, the gold-standard privacy criterion, and encompasses a large class of probabilistic models, called the Conjugate Exponential (CE) family. We observe that we can straightforwardly privatise VB's approximate posterior distributions for models in the CE family, by perturbing the expected sufficient statistics of the complete-data likelihood. For a broadly-used class of non-CE models, those with binomial likelihoods, we show how to bring such models into the CE family, such that inferences in the modified model resemble the private variational Bayes algorithm as closely as possible, using the Polya-Gamma data augmentation scheme. The iterative nature of variational Bayes presents a further challenge since iterations increase the amount of noise needed. We overcome this by combining: (1) an improved composition method for differential privacy, called the moments accountant, which provides a tight bound on the privacy cost of multiple VB iterations and thus significantly decreases the amount of additive noise; and (2) the privacy amplification effect of subsampling mini-batches from large-scale data in stochastic learning. We empirically demonstrate the effectiveness of our method in CE and non-CE models including latent Dirichlet allocation, Bayesian logistic regression, and sigmoid belief networks, evaluated on real-world datasets.
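
The core privatisation step, perturbing the expected sufficient statistics of the complete-data likelihood, can be sketched with a Gaussian mechanism (the clipping bound, noise scale, and toy Gaussian model below are illustrative assumptions; the paper additionally uses the moments accountant and subsampling amplification to track the privacy cost over iterations):

```python
import numpy as np

rng = np.random.default_rng(0)

def privatize_sufficient_stats(per_example_stats, clip=1.0, noise_multiplier=1.0):
    """Gaussian-mechanism sketch: clip each example's sufficient-statistic vector to
    L2 norm `clip`, sum them, and add noise calibrated to that sensitivity."""
    norms = np.linalg.norm(per_example_stats, axis=1, keepdims=True)
    clipped = per_example_stats * np.minimum(1.0, clip / np.maximum(norms, 1e-12))
    noise = rng.normal(scale=noise_multiplier * clip, size=per_example_stats.shape[1])
    return clipped.sum(axis=0) + noise

# Toy conjugate-exponential example: Gaussian with known variance, sufficient statistic x.
data = rng.normal(loc=2.0, scale=1.0, size=(1000, 1))
noisy_stat = privatize_sufficient_stats(data, clip=5.0, noise_multiplier=1.0)
print("noisy mean estimate:", noisy_stat[0] / len(data))
```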

Journal ArticleDOI
TL;DR: In this article, regret bounds with an optimal dependence on all given parameters are derived for non-episodic, uniformly ergodic Markov decision processes; the bounds could only be improved by using an alternative mixing-time parameter.
Abstract: We give a simple optimistic algorithm for which it is easy to derive regret bounds of $\tilde{O}(\sqrt{t_{\rm mix} SAT})$ after $T$ steps in uniformly ergodic Markov decision processes with $S$ states, $A$ actions, and mixing time parameter $t_{\rm mix}$. These bounds are the first regret bounds in the general, non-episodic setting with an optimal dependence on all given parameters. They could only be improved by using an alternative mixing time parameter.

Journal ArticleDOI
TL;DR: Novel definitions of fairness concepts in terms of market prices are presented, along with a new scheme to round a market equilibrium into an integral allocation in a way that provides most of the fairness properties of an integral max NSW allocation.
Abstract: We consider the problem of fairly allocating a set of indivisible goods among n agents. Various fairness notions have been proposed within the rapidly growing field of fair division, but the Nash social welfare (NSW) serves as a focal point. In part, this follows from the ‘unreasonable’ fairness guarantees provided, in the sense that a max NSW allocation meets multiple other fairness metrics simultaneously, all while satisfying a standard economic concept of efficiency, Pareto optimality. However, existing approximation algorithms fail to satisfy all of the remarkable fairness guarantees offered by a max NSW allocation, instead targeting only the specific NSW objective. We address this issue by presenting, in strongly polynomial time, an allocation that is a 2-approximation of the maximum NSW, Prop-1, 1/(2n)-MMS, and Pareto optimal. Our techniques are based on a market interpretation of a fractional max NSW allocation. We present novel definitions of fairness concepts in terms of market prices, and design a new scheme to round a market equilibrium into an integral allocation in a way that provides most of the fairness properties of an integral max NSW allocation.
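
For reference (a standard definition, not quoted from the abstract), the Nash social welfare of an allocation A = (A_1, ..., A_n) under utility functions u_i is the geometric mean of the agents' utilities, and the max NSW allocation referred to above is a maximizer of this quantity:

```latex
\mathrm{NSW}(A) \;=\; \Bigl(\,\prod_{i=1}^{n} u_i(A_i)\Bigr)^{1/n}
```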

Journal ArticleDOI
TL;DR: This paper envisions a paradigm shift, where AI technologies are brought to the side of consumers and their organizations, with the aim of building an efficient and effective counter-power.
Abstract: Recent years have been tainted by market practices that continuously expose us, as consumers, to new risks and threats. We have become accustomed, and sometimes even resigned, to businesses monitoring our activities, examining our data, and even meddling with our choices. Artificial Intelligence (AI) is often depicted as a weapon in the hands of businesses and blamed for allowing this to happen. In this paper, we envision a paradigm shift, where AI technologies are brought to the side of consumers and their organizations, with the aim of building an efficient and effective counter-power. AI-powered tools can support a massive-scale automated analysis of textual and audiovisual data, as well as code, for the benefit of consumers and their organizations. This in turn can lead to a better oversight of business activities, help consumers exercise their rights, and enable civil society to mitigate information overload. We discuss the societal, political, and technological challenges that stand before that vision.

Journal ArticleDOI
TL;DR: The Bottleneck Simulator as discussed by the authors is a model-based reinforcement learning method which combines a learned, factorized transition model of the environment with rollout simulations to learn an effective policy from few examples.
Abstract: Deep reinforcement learning has recently shown many impressive successes. However, one major obstacle towards applying such methods to real-world problems is their lack of data-efficiency. To this end, we propose the Bottleneck Simulator: a model-based reinforcement learning method which combines a learned, factorized transition model of the environment with rollout simulations to learn an effective policy from few examples. The learned transition model employs an abstract, discrete (bottleneck) state, which increases sample efficiency by reducing the number of model parameters and by exploiting structural properties of the environment. We provide a mathematical analysis of the Bottleneck Simulator in terms of fixed points of the learned policy, which reveals how performance is affected by four distinct sources of error: an error related to the abstract space structure, an error related to the transition model estimation variance, an error related to the transition model estimation bias, and an error related to the transition model class bias. Finally, we evaluate the Bottleneck Simulator on two natural language processing tasks: a text adventure game and a real-world, complex dialogue response selection task. On both tasks, the Bottleneck Simulator yields excellent performance beating competing approaches.

Journal ArticleDOI
TL;DR: A result of these reductions is that QNPs are shown to have the same expressive power and the same complexity as FOND problems, without the need to generate and test for termination the potentially doubly exponential number of solutions of an associated fully observable non-deterministic problem.
Abstract: Qualitative numerical planning is classical planning extended with non-negative real variables that can be increased or decreased "qualitatively", i.e., by positive indeterminate amounts. While deterministic planning with numerical variables is undecidable in general, qualitative numerical planning is decidable and provides a convenient abstract model for generalized planning. The solutions to qualitative numerical problems (QNPs) were shown to correspond to the strong cyclic solutions of an associated fully observable non-deterministic (FOND) problem that terminate. This leads to a generate-and-test algorithm for solving QNPs where solutions to a FOND problem are generated one by one and tested for termination. The computational shortcomings of this approach for solving QNPs, however, are that it is not simple to amend FOND planners to generate all solutions, and that the number of solutions to check can be doubly exponential in the number of variables. In this work we address these limitations while providing additional insights on QNPs. More precisely, we introduce two polynomial-time reductions, one from QNPs to FOND problems and the other from FOND problems to QNPs both of which do not involve termination tests. A result of these reductions is that QNPs are shown to have the same expressive power and the same complexity as FOND problems.

Journal ArticleDOI
TL;DR: Novel subgoaling relaxations for automated planning with propositional and numeric state variables and numeric conditions are studied, with the aim of reaching a good trade-off between accuracy and computation costs within a heuristic state-space search planner.
Abstract: This paper studies novel subgoaling relaxations for automated planning with propositional and numeric state variables. Subgoaling relaxations address one source of complexity of the planning problem: the requirement to satisfy conditions simultaneously. The core idea is to relax this requirement by recursively decomposing conditions into atomic subgoals that are considered in isolation. Such relaxations are typically used for pruning, or as the basis for computing admissible or inadmissible heuristic estimates to guide optimal or satisficing heuristic search planners. In the last decade or so, the subgoaling principle has underpinned the design of an abundance of relaxation-based heuristics whose formulations have greatly extended the reach of classical planning. This paper extends subgoaling relaxations to support numeric state variables and numeric conditions. We provide both theoretical and practical results, with the aim of reaching a good trade-off between accuracy and computation costs within a heuristic state-space search planner. Our experimental results validate the theoretical assumptions, and indicate that subgoaling substantially improves on the state of the art in optimal and satisficing numeric planning via forward state-space search.

Journal ArticleDOI
TL;DR: The interaction between reward and prediction learners is discussed and the importance of introspective prediction learners is highlighted: those that increase their rate of learning when progress is possible, and decrease it when it is not.
Abstract: Learning about many things can provide numerous benefits to a reinforcement learning system. For example, learning many auxiliary value functions, in addition to optimizing the environmental reward, appears to improve both exploration and representation learning. The question we tackle in this paper is how to sculpt the stream of experience—how to adapt the learning system’s behavior—to optimize the learning of a collection of value functions. A simple answer is to compute an intrinsic reward based on the statistics of each auxiliary learner, and use reinforcement learning to maximize that intrinsic reward. Unfortunately, implementing this simple idea has proven difficult, and thus has been the focus of decades of study. It remains unclear which of the many possible measures of learning would work well in a parallel learning setting where environmental reward is extremely sparse or absent. In this paper, we investigate and compare different intrinsic reward mechanisms in a new bandit-like parallel-learning testbed. We discuss the interaction between reward and prediction learners and highlight the importance of introspective prediction learners: those that increase their rate of learning when progress is possible, and decrease when it is not. We provide a comprehensive empirical comparison of 14 different rewards, including well-known ideas from reinforcement learning and active learning. Our results highlight a simple but seemingly powerful principle: intrinsic rewards based on the amount of learning can generate useful behavior, if each individual learner is introspective.
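
A minimal sketch of an introspective intrinsic reward, where "learning progress" is approximated by the gap between a slow and a fast running average of prediction error (this particular signal and its constants are illustrative assumptions; the paper compares 14 different reward mechanisms):

```python
import numpy as np

class IntrospectiveLearner:
    """Toy prediction learner whose intrinsic reward reflects its own learning progress:
    large while its prediction error is still dropping, near zero once it has converged."""

    def __init__(self, step_size=0.1):
        self.estimate = 0.0
        self.step_size = step_size
        self.fast_err = 1.0     # fast running average of the error
        self.slow_err = 1.0     # slow running average of the error

    def update(self, target):
        error = abs(target - self.estimate)
        self.estimate += self.step_size * (target - self.estimate)
        self.fast_err += 0.10 * (error - self.fast_err)
        self.slow_err += 0.01 * (error - self.slow_err)
        return max(self.slow_err - self.fast_err, 0.0)   # intrinsic reward = progress

rng = np.random.default_rng(0)
learner = IntrospectiveLearner()
rewards = [learner.update(rng.normal(5.0, 0.1)) for _ in range(2000)]
print("intrinsic reward early vs. late:", np.mean(rewards[:100]), np.mean(rewards[-100:]))
```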

Journal ArticleDOI
TL;DR: It is shown that a quality-control mechanism based on gold questions (a small set of tasks for which the requester knows the correct answer) is prone to an attack, carried out by a group of colluding crowdworkers, that is easy to implement and deploy.
Abstract: Crowdsourcing is a popular methodology to collect manual labels at scale. Such labels are often used to train AI models and, thus, quality control is a key aspect in the process. One of the most popular quality assurance mechanisms in paid micro-task crowdsourcing is based on gold questions: the use of a small set of tasks for which the requester knows the correct answer and, thus, is able to directly assess crowdwork quality. In this paper, we show that such a mechanism is prone to an attack carried out by a group of colluding crowdworkers that is easy to implement and deploy: the inherent size limit of the gold set can be exploited by building an inferential system to detect which parts of the job are more likely to be gold questions. The described attack is robust to various forms of randomisation and programmatic generation of gold questions. We present the architecture of the proposed system, composed of a browser plug-in and an external server used to share information, and briefly introduce its potential evolution to a decentralised implementation. We implement and experimentally validate the gold question detection system, using real-world data from a popular crowdsourcing platform. Our experimental results show that crowdworkers using the proposed system spend more time on signalled gold questions but do not neglect the others thus achieving an increased overall work quality. Finally, we discuss the economic and sociological implications of this kind of attack.

Journal ArticleDOI
TL;DR: A new modular structure, called atomic decomposition (AD), is developed, which is based on modules that provide strong logical properties, such as locality-based modules, and a survey of existing decomposition approaches is provided.
Abstract: With the growth of ontologies used in diverse application areas, the need for module extraction and modularisation techniques has risen. The notion of the modular structure of an ontology, which comprises a suitable set of base modules together with their logical dependencies, has the potential to help users and developers in comprehending, sharing, and maintaining an ontology. We have developed a new modular structure, called atomic decomposition (AD), which is based on modules that provide strong logical properties, such as locality-based modules. In this article, we present the theoretical foundations of AD, review its logical and computational properties, discuss its suitability as a modular structure, and report on an experimental evaluation of AD. In addition, we discuss the concept of a modular structure in ontology engineering and provide a survey of existing decomposition approaches.

Journal ArticleDOI
TL;DR: Despite the simplicity of the graph, the problem of maximin share allocations of goods on a cycle turns out to be significantly harder than its tree version.
Abstract: The problem of fair division of indivisible goods is a fundamental problem of resource allocation in multi-agent systems, also studied extensively in social choice. Recently, the problem was generalized to the case when goods form a graph and the goal is to allocate goods to agents so that each agent’s bundle forms a connected subgraph. For the maximin share fairness criterion, researchers proved that if goods form a tree, an allocation offering each agent a bundle of at least her maximin share value always exists. Moreover, it can be found in polynomial time. In this paper we consider the problem of maximin share allocations of goods on a cycle. Despite the simplicity of the graph, the problem turns out to be significantly harder than its tree version. We present cases in which maximin share allocations of goods on cycles exist, and provide results on allocations guaranteeing each agent a certain fraction of her maximin share. We also study algorithms for computing maximin share allocations of goods on cycles.
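
For reference (a standard definition, adapted to the connectivity requirement described above), agent i's maximin share on the graph G of goods is the best worst-bundle value she could secure by partitioning the goods into n connected bundles herself:

```latex
\mathrm{MMS}_i \;=\; \max_{(B_1,\dots,B_n)\,\in\,\Pi^{\mathrm{conn}}_n(G)} \;\; \min_{1 \le j \le n} \; u_i(B_j)
```

Here Π^conn_n(G) denotes the partitions of the goods into n bundles, each inducing a connected subgraph of G; on a cycle, each bundle is an arc of consecutive goods.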

Journal ArticleDOI
TL;DR: In this paper, an algorithm for computing pure-strategy ε-Bayes-Nash equilibria (ε-BNEs) in combinatorial auctions is presented.
Abstract: We present a new algorithm for computing pure-strategy ε-Bayes-Nash equilibria (ε-BNEs) in combinatorial auctions. The main innovation of our algorithm is to separate the algorithm’s search phase (for finding the ε-BNE) from the verification phase (for computing the ε). Using this approach, we obtain an algorithm that is both very fast and provides theoretical guarantees on the ε it finds. Our main contribution is a verification method which, surprisingly, allows us to upper bound the ε across the whole continuous value space without making assumptions about the mechanism. Using our algorithm, we can now compute ε-BNEs in multi-minded domains that are significantly more complex than what was previously possible to solve. We release our code under an open-source license to enable researchers to perform algorithmic analyses of auctions, to enable bidders to analyze different strategies, and many other applications.

Journal ArticleDOI
TL;DR: This work proposes a new representation setting for hedonic games, where each agent partitions the set of other agents into friends, enemies, and neutral agents, with friends and enemies being ranked.
Abstract: We propose a new representation setting for hedonic games, where each agent partitions the set of other agents into friends, enemies, and neutral agents, with friends and enemies being ranked. Under the assumption that preferences are monotonic (respectively, antimonotonic) with respect to the addition of friends (respectively, enemies), we propose a bipolar extension of the responsive extension principle, and use this principle to derive the (partial) preferences of agents over coalitions. Then, for a number of solution concepts, we characterize partitions that necessarily or possibly satisfy them, and we study the related problems in terms of their complexity.