
Showing papers in "arXiv: Computer Science and Game Theory in 2021"


Book ChapterDOI
TL;DR: A mathematical model for participatory budgeting is presented, which charts existing models across different axes including whether the projects are treated as “divisible” or “indivisible” and whether there are funding limits on individual projects.
Abstract: Participatory budgeting is a democratic approach to deciding the funding of public projects, which has been adopted in many cities across the world. We present a survey of research on participatory budgeting emerging from the computational social choice literature, which draws ideas from computer science and microeconomic theory. We present a mathematical model for participatory budgeting, which charts existing models across different axes including whether the projects are treated as “divisible” or “indivisible” and whether there are funding limits on individual projects. We then survey various approaches and methods from the literature, giving special emphasis on issues of preference elicitation, welfare objectives, fairness axioms, and voter incentives. Finally, we discuss several directions in which research on participatory budgeting can be extended in the future.

85 citations
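
To make the "indivisible projects" setting from the abstract above concrete, here is a minimal sketch, assuming approval ballots and a utilitarian objective (total number of approvals for funded projects). The function name `best_bundle`, the project names, and the costs are hypothetical and are not taken from the survey; the brute-force search is only meant for tiny instances.

```python
from itertools import combinations

def best_bundle(costs, approvals, budget):
    """Brute-force the affordable project set with the most total approvals.

    costs: dict project -> cost (assumed non-negative)
    approvals: list of sets, one approval ballot per voter
    budget: total funds available
    """
    projects = sorted(costs)
    best, best_score = frozenset(), 0
    for r in range(len(projects) + 1):
        for bundle in combinations(projects, r):
            if sum(costs[p] for p in bundle) > budget:
                continue  # indivisible projects: the whole bundle must be affordable
            score = sum(1 for ballot in approvals for p in bundle if p in ballot)
            if score > best_score:
                best, best_score = frozenset(bundle), score
    return best, best_score

# Hypothetical instance: three projects, four voters, budget of 4.
costs = {"park": 3, "library": 2, "bike_lane": 2}
approvals = [{"park"}, {"park", "library"}, {"park", "bike_lane"}, {"library", "bike_lane"}]
bundle, score = best_bundle(costs, approvals, budget=4)
```

In this instance the most-approved project ("park") is not funded, because the two cheaper projects together collect more approvals within the budget. Other welfare objectives and fairness axioms discussed in the survey would change the selection rule.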


Posted Content
TL;DR: The empirical results validate the theoretical findings and show that both the welfare and revenue can be improved by selecting the weight of the boosts properly.
Abstract: Auto-bidding has become one of the main options for bidding in online advertisements, in which advertisers only need to specify high-level objectives and leave the complex task of bidding to auto-bidders. In this paper, we propose a family of auctions with boosts to improve welfare in auto-bidding environments with both return on ad spend constraints and budget constraints. Our empirical results validate our theoretical findings and show that both the welfare and revenue can be improved by selecting the weight of the boosts properly.

32 citations


Posted Content
TL;DR: This paper studies the problem of fairly allocating a set of indivisible goods among $n$ agents with additive valuations and shows that, for every $\varepsilon \in (0,1/2]$, there always exists a $(1-\varepsilon)$-EFX allocation with a sublinear number of unallocated goods and high Nash welfare.
Abstract: We study the problem of fairly allocating a set of indivisible goods among $n$ agents with additive valuations. Envy-freeness up to any good (EFX) is arguably the most compelling fairness notion in this context. However, the existence of EFX allocations has not been settled and is one of the most important problems in fair division. Towards resolving this problem, many impressive results show the existence of its relaxations, e.g., the existence of $0.618$-EFX allocations, and the existence of EFX with at most $n-1$ unallocated goods. The latter result was recently improved for three agents, in which the two unallocated goods are allocated through an involved procedure. Reducing the number of unallocated goods for an arbitrary number of agents is a systematic way to settle the big question. In this paper, we develop a new approach, and show that for every $\varepsilon \in (0,1/2]$, there always exists a $(1-\varepsilon)$-EFX allocation with a sublinear number of unallocated goods and high Nash welfare. For this, we reduce the EFX problem to a novel problem in extremal graph theory. We introduce the notion of rainbow cycle number $R(\cdot)$. For all $d \in \mathbb{N}$, $R(d)$ is the largest $k$ such that there exists a $k$-partite digraph $G =(\cup_{i \in [k]} V_i, E)$, in which 1) each part has at most $d$ vertices, i.e., $\lvert V_i \rvert \leq d$ for all $i \in [k]$, 2) for any two parts $V_i$ and $V_j$, each vertex in $V_i$ has an incoming edge from some vertex in $V_j$ and vice-versa, and 3) there exists no cycle in $G$ that contains at most one vertex from each part. We show that any upper bound on $R(d)$ directly translates to a sublinear bound on the number of unallocated goods. We establish a polynomial upper bound on $R(d)$, yielding our main result. Furthermore, our approach is constructive, which also gives a polynomial-time algorithm for finding such an allocation.

25 citations
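
The $(1-\varepsilon)$-EFX notion in the abstract above can be checked directly for a given allocation. The sketch below assumes additive valuations stored as a matrix and uses the standard definition: agent $i$ is $\alpha$-EFX-satisfied if $v_i(A_i) \ge \alpha \cdot v_i(A_j \setminus \{g\})$ for every other agent $j$ and every good $g \in A_j$. The function name `is_alpha_efx` and the example numbers are hypothetical; unallocated goods are simply left out of every bundle. This only verifies an allocation and is not the paper's construction.

```python
def is_alpha_efx(valuations, allocation, alpha=1.0):
    """Check the alpha-EFX condition for additive valuations.

    valuations[i][g]: value of good g to agent i
    allocation[i]: list of goods held by agent i (goods may remain unallocated)
    alpha: 1.0 for exact EFX, (1 - eps) for the relaxed notion
    """
    def value(i, bundle):
        return sum(valuations[i][g] for g in bundle)

    n = len(allocation)
    for i in range(n):
        own = value(i, allocation[i])
        for j in range(n):
            if i == j:
                continue
            other = value(i, allocation[j])
            for g in allocation[j]:
                # Value of j's bundle, in i's eyes, after removing good g.
                reduced = other - valuations[i][g]
                if own < alpha * reduced - 1e-12:
                    return False
    return True

# Hypothetical instance: two agents with identical additive valuations.
vals = [[5, 3, 2], [5, 3, 2]]
```

With these valuations, giving good 0 to one agent and goods 1 and 2 to the other satisfies exact EFX, while giving only good 2 to the first agent does not, although it still passes the check for a sufficiently small `alpha`.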


Posted Content
TL;DR: In this paper, the authors consider the scenario in which a group of IoT devices are employed by the Metaverse platform to collect such data on behalf of virtual service providers (VSPs), and adopt hybrid evolutionary dynamics, in which heterogeneous device owners can employ different revision protocols to update their strategies.
Abstract: Spurred by the severe restrictions on mobility due to the COVID-19 pandemic, there is currently intense interest in developing the Metaverse, to offer virtual services/business online. A key enabler of such virtual service is the digital twin, i.e., a digital replication of real-world entities in the Metaverse, e.g., city twin, avatars, etc. The real-world data collected by IoT devices and sensors are key for synchronizing the two worlds. In this paper, we consider the scenario in which a group of IoT devices are employed by the Metaverse platform to collect such data on behalf of virtual service providers (VSPs). Device owners, who are self-interested, dynamically select a VSP to maximize rewards. We adopt hybrid evolutionary dynamics, in which heterogeneous device owner populations can employ different revision protocols to update their strategies. Extensive simulations demonstrate that a hybrid protocol can lead to evolutionary stable states.

20 citations


Posted Content
TL;DR: In this article, the authors provide a comprehensive review of the economic and game theoretic approaches proposed in the literature to design various schemes for stimulating data owners to participate in the FL training process.
Abstract: Federated learning (FL) has become popular and has shown great potential in training large-scale machine learning (ML) models without exposing the owners' raw data. In FL, the data owners can train ML models based on their local data and only send the model updates rather than raw data to the model owner for aggregation. To improve learning performance in terms of model accuracy and training completion time, it is essential to recruit sufficient participants. Meanwhile, the data owners are rational and may be unwilling to participate in the collaborative learning process due to the resource consumption. To address these issues, various works have recently been proposed to motivate the data owners to contribute their resources. In this paper, we provide a comprehensive review of the economic and game theoretic approaches proposed in the literature to design various schemes for stimulating data owners to participate in the FL training process. In particular, we first present the fundamentals and background of FL and the economic theories commonly used in incentive mechanism design. Then, we review applications of game theory and economic approaches to incentive mechanism design for FL. Finally, we highlight some open issues and future research directions concerning incentive mechanism design for FL.

16 citations


Posted Content
TL;DR: In this article, a learning-based incentive mechanism framework for VR services in the Metaverse is proposed, where the quality of perception is the metric for VR users immersed in the virtual world, and a double Dutch auction mechanism is used to determine optimal pricing and allocation rules in this market.
Abstract: The Metaverse is regarded as the next-generation Internet paradigm that allows humans to play, work, and socialize in an alternative virtual world with immersive experience, for instance, via head-mounted display for Virtual Reality (VR) rendering. With the help of ubiquitous wireless connections and powerful edge computing technologies, VR users in the wireless edge-empowered Metaverse can immerse themselves in the virtual world by accessing VR services offered by different providers. However, VR applications are computation- and communication-intensive. The VR service providers (SPs) have to optimize the VR service delivery efficiently and economically given their limited communication and computation resources. An incentive mechanism can thus be applied as an effective tool for managing VR services between providers and users. Therefore, in this paper, we propose a learning-based incentive mechanism framework for VR services in the Metaverse. First, we propose the quality of perception as the metric for VR users immersed in the virtual world. Second, for quick trading of VR services between VR users (i.e., buyers) and VR SPs (i.e., sellers), we design a double Dutch auction mechanism to determine optimal pricing and allocation rules in this market. Third, for auction communication reduction, we design a deep reinforcement learning-based auctioneer to accelerate this auction process. Experimental results demonstrate that the proposed framework can achieve near-optimal social welfare while reducing the auction information exchange cost by at least half compared with baseline methods.

14 citations


Proceedings ArticleDOI
TL;DR: In this paper, the authors introduce a dynamic allocation system based on two-stage stochastic programming to improve employment outcomes of resettled refugees, achieving over 98 percent of the hindsight-optimal employment compared to under 90 percent of current greedy-like approaches.
Abstract: Employment outcomes of resettled refugees depend strongly on where they are placed inside the host country. While the United States sets refugee capacities for communities on an annual basis, refugees arrive and must be placed over the course of the year. We introduce a dynamic allocation system based on two-stage stochastic programming to improve employment outcomes. Our algorithm is able to achieve over 98 percent of the hindsight-optimal employment compared to under 90 percent of current greedy-like approaches. This dramatic improvement persists even when we incorporate a vast array of practical features of the refugee resettlement process including indivisible families, batching, and uncertainty with respect to the number of future arrivals. Our algorithm is now part of the Annie MOORE optimization software used by a leading American refugee resettlement agency.

13 citations


Posted Content
TL;DR: In this article, a stochastic optimal resource allocation scheme (SORAS) is proposed to minimize the cost of the virtual service provider while accounting for uncertainty in users' demands.
Abstract: Dubbed the next-generation Internet, the metaverse is a virtual world that allows users to interact with each other or objects in real-time using their avatars. The metaverse is envisioned to support novel ecosystems of service provision in an immersive environment brought about by an intersection of the virtual and physical worlds. The native AI systems in the metaverse will personalize user experience over time and shape the experience in a scalable, seamless, and synchronous way. However, the metaverse is characterized by diverse resource types amid a highly dynamic demand environment. In this paper, we propose the case study of virtual education in the metaverse and address the unified resource allocation problem amid stochastic user demand. We propose a stochastic optimal resource allocation scheme (SORAS) based on stochastic integer programming with the objective of minimizing the cost of the virtual service provider. The simulation results show that SORAS can minimize the cost of the virtual service provider while accounting for uncertainty in users' demands.

12 citations


Posted Content
TL;DR: This paper proposes PreferenceNet, an extension of existing neural-network-based auction mechanisms to encode constraints using (potentially human-provided) exemplars of desirable allocations and introduces a new metric to evaluate an auction allocation's adherence to such socially desirable constraints.
Abstract: The design of optimal auctions is a problem of interest in economics, game theory and computer science. Despite decades of effort, strategyproof, revenue-maximizing auction designs are still not known outside of restricted settings. However, recent methods using deep learning have shown some success in approximating optimal auctions, recovering several known solutions and outperforming strong baselines when optimal auctions are not known. In addition to maximizing revenue, auction mechanisms may also seek to encourage socially desirable constraints such as allocation fairness or diversity. However, these philosophical notions are neither standardized nor equipped with widely accepted formal definitions. In this paper, we propose PreferenceNet, an extension of existing neural-network-based auction mechanisms to encode constraints using (potentially human-provided) exemplars of desirable allocations. In addition, we introduce a new metric to evaluate an auction allocation's adherence to such socially desirable constraints and demonstrate that our proposed method is competitive with current state-of-the-art neural-network-based auction designs. We validate our approach through human subject research and show that we are able to effectively capture real human preferences. Our code is available at this https URL

12 citations


Posted Content
TL;DR: This paper presented a $380$-approximation algorithm for the Nash Social Welfare problem with submodular valuations, which builds on and extends a recent constant-factor approximation for Rado valuations.
Abstract: We present a $380$-approximation algorithm for the Nash Social Welfare problem with submodular valuations. Our algorithm builds on and extends a recent constant-factor approximation for Rado valuations.

12 citations


Book ChapterDOI
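TL;DR: In this article, voting rules are revisited in light of the Gibbard-Satterthwaite theorem under the weaker notion of not obvious manipulability; several classes of voting rules are shown to satisfy this notion, while several rules, including k-approval, are shown to fail it.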
Abstract: The Gibbard-Satterthwaite theorem states that no unanimous and non-dictatorial voting rule is strategyproof. We revisit voting rules and consider a weaker notion of strategyproofness called not obvious manipulability that was proposed by Troyan and Morrill (2020). We identify several classes of voting rules that satisfy this notion. We also show that several voting rules including k-approval fail to satisfy this property. We characterize conditions under which voting rules are obviously manipulable. One of our insights is that certain rules are obviously manipulable when the number of alternatives is relatively large compared to the number of voters. In contrast to the Gibbard-Satterthwaite theorem, many of the rules we examined are not obviously manipulable. This reflects the relatively easier satisfiability of the notion and the zero information assumption of not obvious manipulability, as opposed to the perfect information assumption of strategyproofness. We also present algorithmic results for computing obvious manipulations and report on experiments.

Posted Content
TL;DR: These contributions further the understanding of arena-independent finite-memory (AIFM) determinacy, i.e., the study of objectives for which memory is needed, but in a way that only depends on limited parameters of the game graphs.
Abstract: We study stochastic zero-sum games on graphs, which are prevalent tools to model decision-making in presence of an antagonistic opponent in a random environment. In this setting, an important question is the one of strategy complexity: what kinds of strategies are sufficient or required to play optimally (e.g., randomization or memory requirements)? Our contributions further the understanding of arena-independent finite-memory (AIFM) determinacy, i.e., the study of objectives for which memory is needed, but in a way that only depends on limited parameters of the game graphs. First, we show that objectives for which pure AIFM strategies suffice to play optimally also admit pure AIFM subgame perfect strategies. Second, we show that we can reduce the study of objectives for which pure AIFM strategies suffice in two-player stochastic games to the easier study of one-player stochastic games (i.e., Markov decision processes). Third, we characterize the sufficiency of AIFM strategies through two intuitive properties of objectives. This work extends a line of research started on deterministic games in [BLO+20] to stochastic ones. [BLO+20] Patricia Bouyer, Stephane Le Roux, Youssouf Oualhadj, Mickael Randour, and Pierre Vandenhove. Games Where You Can Play Optimally with Arena-Independent Finite Memory. CONCUR 2020.

Proceedings ArticleDOI
TL;DR: In this paper, the seller's problem of designing a revenue-optimal pricing scheme to sell her information to the buyer was studied, where the seller can access the realized state of nature and this information is useful for the information buyer to better estimate his payoff from the active action.
Abstract: A decision maker looks to take an active action (e.g., purchase some goods or make an investment). The payoff of this active action depends on his own private type as well as a random and unknown state of nature. To decide between this active action and another passive action, which always leads to a safe constant utility, the decision maker may purchase information from an information seller. The seller can access the realized state of nature, and this information is useful for the decision maker (i.e., the information buyer) to better estimate his payoff from the active action. We study the seller's problem of designing a revenue-optimal pricing scheme to sell her information to the buyer. Assuming that the buyer's private type and the state of nature are drawn from two independent distributions, we fully characterize the optimal pricing mechanism for the seller in closed form. Specifically, under a natural linearity assumption of the buyer payoff function, we show that an optimal pricing mechanism is the threshold mechanism, which charges each buyer type some upfront payment and then reveals whether the realized state is above some threshold or below it. The payment and the threshold are generally different for different buyer types, and are carefully tailored to accommodate the different amounts of risk each buyer type can take. The proof of our results relies on novel techniques and concepts, such as upper/lower virtual values and their mixtures, which may be of independent interest.

Posted Content
TL;DR: The existence of EFX allocations is a major open problem in fair division, even for additive valuations, and it is known that EFX exists if one can leave $n-1$ items unallocated, where $n$ is the number of agents.
Abstract: The existence of EFX allocations is a major open problem in fair division, even for additive valuations. The current state of the art is that no setting where EFX allocations are impossible is known, and EFX is known to exist for ($i$) agents with identical valuations, ($ii$) 2 agents, ($iii$) 3 agents with additive valuations, ($iv$) agents with one of two additive valuations and ($v$) agents with two-valued instances. It is also known that EFX exists if one can leave $n-1$ items unallocated, where $n$ is the number of agents. We develop new techniques that allow us to push the boundaries of the enigmatic EFX problem beyond these known results, and, arguably, to simplify proofs of earlier results. Our main results are ($i$) every setting with 4 additive agents admits an EFX allocation that leaves at most a single item unallocated, ($ii$) every setting with $n$ additive valuations has an EFX allocation with at most $n-2$ unallocated items. Moreover, all of our results extend beyond additive valuations to all nice cancelable valuations (a new class, including additive, unit-demand, budget-additive and multiplicative valuations, among others). Furthermore, using our new techniques, we show that previous results for additive valuations extend to nice cancelable valuations.

Posted Content
TL;DR: In this paper, the authors consider the problem of heterogeneous facility location with limited resources and study mechanisms that aim to maximize the social welfare under the constraint of incentivizing the agents to truthfully report their positions and preferences.
Abstract: We initiate the study of the heterogeneous facility location problem with limited resources. We mainly focus on the fundamental case where a set of agents are positioned in the line segment [0,1] and have approval preferences over two available facilities. A mechanism takes as input the positions and the preferences of the agents, and chooses to locate a single facility based on this information. We study mechanisms that aim to maximize the social welfare (the total utility the agents derive from facilities they approve), under the constraint of incentivizing the agents to truthfully report their positions and preferences. We consider three different settings depending on the level of agent-related information that is public or private. For each setting, we design deterministic and randomized strategyproof mechanisms that achieve a good approximation of the optimal social welfare, and complement these with nearly-tight impossibility results.

Proceedings ArticleDOI
TL;DR: In this paper, a deep distribution network for optimal bidding in both open (non-censored) and closed (censored) online first-price auctions is proposed, which outperforms previous state-of-the-art algorithms in terms of both surplus and effective cost per action (eCPX) metrics.
Abstract: Since 2019, most ad exchanges and sell-side platforms (SSPs) in the online advertising industry have shifted from second-price to first-price auctions. Due to the fundamental difference between these auctions, demand-side platforms (DSPs) have had to update their bidding strategies to avoid bidding unnecessarily high and hence overpaying. Bid shading was proposed to adjust the bid price intended for second-price auctions, in order to balance cost and winning probability in a first-price auction setup. In this study, we introduce a novel deep distribution network for optimal bidding in both open (non-censored) and closed (censored) online first-price auctions. Offline and online A/B testing results show that our algorithm outperforms previous state-of-the-art algorithms in terms of both surplus and effective cost per action (eCPX) metrics. Furthermore, the algorithm is optimized in run-time and has been deployed into VerizonMedia DSP as a production algorithm, serving hundreds of billions of bid requests per day. Online A/B tests show that advertisers' ROI is improved by +2.4%, +2.4%, and +8.6% for impression based (CPM), click based (CPC), and conversion based (CPA) campaigns respectively.
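
The trade-off between cost and winning probability that bid shading addresses can be sketched as a one-dimensional search. This is a minimal illustration, assuming the bidder's expected surplus is $(v - b)\cdot P(\text{win} \mid b)$ and that a win-probability function is already available; the function `shade_bid`, the grid search, and the uniform win-probability model are hypothetical stand-ins and not the deep distribution network described in the abstract.

```python
def shade_bid(value, win_prob, grid=101):
    """Grid-search the bid in [0, value] that maximizes expected surplus.

    value: the bidder's private value (what a second-price bid would be)
    win_prob: callable mapping a bid to an estimated probability of winning
    grid: number of candidate bids to evaluate (assumed >= 2)
    """
    best_bid, best_surplus = 0.0, 0.0
    for k in range(grid):
        bid = value * k / (grid - 1)
        # In a first-price auction the winner pays its own bid.
        surplus = (value - bid) * win_prob(bid)
        if surplus > best_surplus:
            best_bid, best_surplus = bid, surplus
    return best_bid, best_surplus

# Assumed toy model: the minimum bid needed to win is uniform on [0, 1].
uniform_win_prob = lambda b: min(max(b, 0.0), 1.0)
bid, surplus = shade_bid(0.8, uniform_win_prob)
```

Under this toy model the surplus $(v - b)\,b$ is maximized at $b = v/2$, so a value of 0.8 is shaded to a bid of about 0.4. In practice the win-probability curve would be estimated from auction feedback, which is where the censored versus non-censored distinction matters.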

Posted Content
TL;DR: In this article, it was shown that computing an approximate Markov perfect equilibrium (MPE) in a finite-state discounted stochastic game within the exponential precision is PPAD-complete.
Abstract: Similar to the role of Markov decision processes in reinforcement learning, Stochastic Games (SGs) lay the foundation for the study of multi-agent reinforcement learning (MARL) and sequential agent interactions. In this paper, we derive that computing an approximate Markov Perfect Equilibrium (MPE) in a finite-state discounted Stochastic Game within the exponential precision is PPAD-complete. We adopt a function with a polynomially bounded description in the strategy space to convert the MPE computation to a fixed-point problem, even though the stochastic game may demand an exponential number of pure strategies, in the number of states, for each agent. The completeness result follows the reduction of the fixed-point problem to End of the Line. Our results indicate that finding an MPE in SGs is highly unlikely to be NP-hard unless NP = co-NP. Our work offers confidence for MARL research to study MPE computation on general-sum SGs and to develop fruitful algorithms as currently on zero-sum SGs.

Posted Content
TL;DR: The performance of a natural selection mechanism called approval voting with default (AVD) is analyzed, and it is shown to achieve an $O(\sqrt{n\ln{n}})$ additive guarantee for opinion poll inputs and an $O(\ln^2 n)$ guarantee for a priori popularity inputs, where $n$ is the number of individuals.
Abstract: We study the problem of impartial selection, a topic that lies at the intersection of computational social choice and mechanism design. The goal is to select the most popular individual among a set of community members. The input can be modeled as a directed graph, where each node represents an individual, and a directed edge indicates nomination or approval of a community member to another. An impartial mechanism is robust to potential selfish behavior of the individuals and provides appropriate incentives to voters to report their true preferences by ensuring that the chance of a node to become a winner does not depend on its outgoing edges. The goal is to design impartial mechanisms that select a node with an in-degree that is as close as possible to the highest in-degree. We measure the efficiency of such a mechanism by the difference of these in-degrees, known as its additive approximation. In particular, we study the extent to which prior information on voters' preferences could be useful in the design of efficient deterministic impartial selection mechanisms with good additive approximation guarantees. We consider three models of prior information, which we call the opinion poll, the a priori popularity, and the uniform model. We analyze the performance of a natural selection mechanism that we call approval voting with default (AVD) and show that it achieves a $O(\sqrt{n\ln{n}})$ additive guarantee for opinion poll and a $O(\ln^2 n)$ for a priori popularity inputs, where $n$ is the number of individuals. We consider this polylogarithmic bound as our main technical contribution. We complement this last result by showing that our analysis is close to tight, showing an $\Omega(\ln{n})$ lower bound. This holds in the uniform model, which is the simplest among the three models.
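
The two notions in the abstract above, impartiality and additive approximation, can be sketched for small nomination graphs. The code below assumes deterministic mechanisms given as functions from `(n, edges)` to a winning node, and brute-forces impartiality by checking that changing a node's outgoing edges never changes whether that node wins. The two mechanisms shown (`plain_approval` and `fixed_default`) are hypothetical illustrations; neither is the AVD mechanism, which the abstract does not specify.

```python
from itertools import product

def in_degrees(n, edges):
    """Count incoming nominations for each of the n nodes."""
    deg = [0] * n
    for _, v in edges:
        deg[v] += 1
    return deg

def additive_gap(n, edges, winner):
    """Highest in-degree minus the in-degree of the selected node."""
    deg = in_degrees(n, edges)
    return max(deg) - deg[winner]

def plain_approval(n, edges):
    """Select a node of maximum in-degree, breaking ties by lowest index."""
    deg = in_degrees(n, edges)
    return deg.index(max(deg))

def fixed_default(n, edges):
    """Always select node 0, ignoring all nominations."""
    return 0

def is_impartial(mechanism, n):
    """Brute-force check over all graphs on n nodes without self-loops."""
    pairs = [(u, v) for u in range(n) for v in range(n) if u != v]
    for mask in product([False, True], repeat=len(pairs)):
        g = {p for p, keep in zip(pairs, mask) if keep}
        for u in range(n):
            rest = {e for e in g if e[0] != u}
            out_options = [(u, v) for v in range(n) if v != u]
            for out_mask in product([False, True], repeat=len(out_options)):
                h = rest | {e for e, keep in zip(out_options, out_mask) if keep}
                # u's own nominations must not affect whether u is selected.
                if (mechanism(n, g) == u) != (mechanism(n, h) == u):
                    return False
    return True
```

The contrast illustrates the tension the paper studies: plain approval has zero additive gap but is not impartial (a node can win by withholding a nomination that would create a tie), while a fixed default is impartial but can have an additive gap as large as the maximum in-degree.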

Posted Content
TL;DR: In this paper, the authors employ a hypergame framework to analyze the single-leader-multiple-followers (SLMF) Stackelberg security game with two typical misinformed situations: misperception and deception.
Abstract: In this paper, we employ a hypergame framework to analyze the single-leader-multiple-followers (SLMF) Stackelberg security game with two typical misinformed situations: misperception and deception. We provide a stability criterion with the help of hyper Nash equilibrium (HNE) to analyze both strategic stability and cognitive stability of equilibria in SLMF games with misinformation. To this end, we find mild stable conditions such that the equilibria with misperception and deception can derive HNE. Moreover, we analyze the robustness of the equilibria to reveal whether the players have the ability to keep their profits.

Posted Content
TL;DR: Wang et al. modeled the interactions between the protocol designer, users, and miners as a three-stage Stackelberg game, and found that, because miners neglect the negative externality in transaction selection, they are willing to accept insufficient-fee transactions.
Abstract: Miners in a blockchain system are suffering from ever-increasing storage costs, which in general have not been properly compensated by the users' transaction fees. This reduces the incentives for the miners' participation and may jeopardize the blockchain security. We propose to mitigate this blockchain insufficient fee issue through a Fee and Waiting Tax (FWT) mechanism, which explicitly considers the two types of negative externalities in the system. Specifically, we model the interactions between the protocol designer, users, and miners as a three-stage Stackelberg game. By characterizing the equilibrium of the game, we find that, because miners neglect the negative externality in transaction selection, they are willing to accept insufficient-fee transactions. This leads to the insufficient storage fee issue in the existing protocol. Moreover, our proposed optimal FWT mechanism can motivate users to pay sufficient transaction fees to cover the storage costs and achieve the unconstrained social optimum. Numerical results show that the optimal FWT mechanism guarantees sufficient transaction fees and achieves an average social welfare improvement of 33.73% or more over the existing protocol. Furthermore, the optimal FWT mechanism achieves the maximum fairness index and performs well even under heterogeneous-storage-cost miners.

Posted Content
TL;DR: In this paper, an evolutionary game-theoretic framework was proposed to study the coupled evolutions of herd behaviors and epidemics, and the authors extended the classical degree-based mean field epidemic model over complex networks by coupling it with the evolutionary game dynamics.
Abstract: The recent COVID-19 pandemic has led to an increasing interest in the modeling and analysis of infectious diseases. The pandemic has made a significant impact on the way we behave and interact in our daily life. The past year has witnessed a strong interplay between human behaviors and epidemic spreading. In this paper, we propose an evolutionary game-theoretic framework to study the coupled evolutions of herd behaviors and epidemics. Our framework extends the classical degree-based mean-field epidemic model over complex networks by coupling it with the evolutionary game dynamics. The statistically equivalent individuals in a population choose their social activity intensities based on the fitness or the payoffs that depend on the state of the epidemics. Meanwhile, the spreading of the infectious disease over the complex network is reciprocally influenced by the players' social activities. We analyze the coupled dynamics by studying the stationary properties of the epidemic for a given herd behavior and the structural properties of the game for a given epidemic process. The decisions of the herd turn out to be strategic substitutes. We formulate an equivalent finite-player game and an equivalent network to represent the interactions among the finite populations. We develop structure-preserving approximation techniques to study time-dependent properties of the joint evolution of the behavioral and epidemic dynamics. The resemblance between the simulated coupled dynamics and the real COVID-19 statistics in the numerical experiments indicates the predictive power of our framework.

Journal ArticleDOI
TL;DR: In this paper, an iterative auction game among network slice tenants and a plurality of price-taking subnet service providers is proposed to solve the problem of inter-domain resource provisioning to network slices in an on-demand fashion.
Abstract: Network slicing is emerging as a promising method to provide sought-after versatility and flexibility to cope with ever-increasing demands. To realize such potential advantages and to meet the challenging requirements of various network slices in an on-demand fashion, we need to develop an agile and distributed mechanism for resource provisioning to different network slices in a heterogeneous multi-resource multi-domain mobile network environment. We formulate inter-domain resource provisioning to network slices in such an environment as an optimization problem which maximizes social welfare among network slice tenants (thereby maximizing tenants' satisfaction), while minimizing operational expenditures for infrastructure service providers at the same time. To solve the envisioned problem, we implement an iterative auction game among network slice tenants, on the one hand, and a plurality of price-taking subnet service providers, on the other hand. We show that the proposed solution method results in a distributed privacy-saving mechanism which converges to the optimal solution of the described optimization problem. In addition to providing analytical results to characterize the performance of the proposed mechanism, we also employ numerical evaluations to validate the results, demonstrate convergence of the presented algorithm, and show the enhanced performance of the proposed approach (in terms of resource utilization, fairness and operational costs) against the existing solutions.

Posted Content
TL;DR: In this paper, the authors consider the problem of allocating indivisible goods to agents with additive valuation functions and show that there is no negative example in which the difference between the number of items and number of agents is smaller than six.
Abstract: We consider the problem of allocating indivisible goods to agents with additive valuation functions. Kurokawa, Procaccia and Wang [JACM, 2018] present instances for which every allocation gives some agent less than her maximin share. We present such examples with larger gaps. For three agents and nine items, we design an instance in which at least one agent does not get more than a $\frac{39}{40}$ fraction of her maximin share. Moreover, we show that there is no negative example in which the difference between the number of items and the number of agents is smaller than six, and that the gap (of $\frac{1}{40}$) of our example is worst possible among all instances with nine items. For $n \ge 4$ agents, we show examples in which at least one agent does not get more than a $1 - \frac{1}{n^4}$ fraction of her maximin share. In the instances designed by Kurokawa, Procaccia and Wang, the gap is exponentially small in $n$.
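
To make the maximin share concrete, the following minimal sketch computes it by brute force for a single agent with additive, non-negative item values. The function name `maximin_share` and the example values are illustrative assumptions, not taken from the paper. The enumeration is exponential in the number of items, so it only suits tiny instances.

```python
from itertools import product


def maximin_share(values, n_agents):
    """Brute-force maximin share of one agent with additive values.

    The agent imagines splitting all items into n_agents bundles and
    receiving the least valuable one; the maximin share is the best
    such guarantee over all possible splits.
    Assumes non-negative item values (hypothetical helper).
    """
    best = 0
    # Enumerate every assignment of items to bundle indices.
    for assignment in product(range(n_agents), repeat=len(values)):
        bundles = [0] * n_agents
        for value, bundle in zip(values, assignment):
            bundles[bundle] += value
        # Keep the split whose worst bundle is worth the most.
        best = max(best, min(bundles))
    return best


if __name__ == "__main__":
    # Items worth 1..6 split among three agents: {6,1}, {5,2}, {4,3}.
    print(maximin_share([1, 2, 3, 4, 5, 6], 3))
```

An allocation gives an agent her maximin share if her bundle is worth at least this value under her own valuation. The negative examples in the abstract are instances where no allocation achieves this for all agents simultaneously.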

Posted Content
TL;DR: In this paper, the authors develop a radically uncoupled Q-learning dynamics that converges to the best response to the opponent's strategy when the opponent follows an asymptotically stationary strategy; the value function estimates converge to the payoffs at a Nash equilibrium when both agents adopt the dynamics.
Abstract: We study multi-agent reinforcement learning (MARL) in infinite-horizon discounted zero-sum Markov games. We focus on the practical but challenging setting of decentralized MARL, where agents make decisions without coordination by a centralized controller, but only based on their own payoffs and local actions executed. The agents need not observe the opponent's actions or payoffs, possibly being even oblivious to the presence of the opponent, nor be aware of the zero-sum structure of the underlying game, a setting also referred to as radically uncoupled in the literature of learning in games. In this paper, we develop for the first time a radically uncoupled Q-learning dynamics that is both rational and convergent: the learning dynamics converges to the best response to the opponent's strategy when the opponent follows an asymptotically stationary strategy; the value function estimates converge to the payoffs at a Nash equilibrium when both agents adopt the dynamics. The key challenge in this decentralized setting is the non-stationarity of the learning environment from an agent's perspective, since both her own payoffs and the system evolution depend on the actions of other agents, and each agent adapts their policies simultaneously and independently. To address this issue, we develop a two-timescale learning dynamics where each agent updates her local Q-function and value function estimates concurrently, with the latter happening at a slower timescale.

Posted Content
TL;DR: Simulation results show that the edge servers are incentivized to allocate more CPU power when multiple rewards are offered, i.e., there are multiple winners, instead of rewarding only the edge server with the largest CPU power allocation.
Abstract: Coded distributed computing (CDC) has emerged as a promising approach because it enables computation tasks to be carried out in a distributed manner while mitigating straggler effects, which often account for the long overall completion times. Specifically, by using polynomial codes, computed results from only a subset of edge servers can be used to reconstruct the final result. However, incentive issues have not been studied systematically for the edge servers to complete the CDC tasks. In this paper, we propose a tractable two-level game-theoretic approach to incentivize the edge servers to complete the CDC tasks. Specifically, in the lower level, a hedonic coalition formation game is formulated where the edge servers share their resources within their coalitions. By forming coalitions, the edge servers have more Central Processing Unit (CPU) power to complete the computation tasks. In the upper level, given the CPU power of the coalitions of edge servers, an all-pay auction is designed to incentivize the edge servers to participate in the CDC tasks. In the all-pay auction, the bids of the edge servers are represented by the allocation of their CPU power to the CDC tasks. The all-pay auction is designed to maximize the utility of the cloud server by determining the allocation of rewards to the winners. Simulation results show that the edge servers are incentivized to allocate more CPU power when multiple rewards are offered, i.e., there are multiple winners, instead of rewarding only the edge server with the largest CPU power allocation. Besides, the utility of the cloud server is maximized when it offers multiple homogeneous rewards, instead of heterogeneous rewards.

Posted Content
TL;DR: It is shown that for any game with unawareness there is a rationalizable discovery process that leads to a self-confirming game that possesses a self-confirming equilibrium in extensive-form rationalizable strategies.
Abstract: Equilibrium notions for games with unawareness in the literature cannot be interpreted as steady states of a learning process because players may discover novel actions during play. In this sense, many games with unawareness are "self-destroying" as a player's representation of the game may change after playing it once. We define discovery processes where at each state there is an extensive-form game with unawareness that together with the players' play determines the transition to possibly another extensive-form game with unawareness in which players are now aware of actions that they have discovered. A discovery process is rationalizable if players play extensive-form rationalizable strategies in each game with unawareness. We show that for any game with unawareness there is a rationalizable discovery process that leads to a self-confirming game that possesses a self-confirming equilibrium in extensive-form rationalizable conjectures. This notion of equilibrium can be interpreted as a steady state of both a discovery and a learning process.

Proceedings ArticleDOI
TL;DR: In this article, the authors study a mechanism-design problem motivated by kidney exchange and show that, in worst-case instances, competing with the overall longest path is impossible while incentivizing approximate truthfulness, i.e., requiring that hiding nodes cannot increase a player's utility by more than a factor of $1 + o(1)$.
Abstract: Motivated by kidney exchange, we study the following mechanism-design problem: On a directed graph (of transplant compatibilities among patient-donor pairs), the mechanism must select a simple path (a chain of transplantations) starting at a distinguished vertex (an altruistic donor) such that the total length of this path is as large as possible (a maximum number of patients receive a kidney). However, the mechanism does not have direct access to the graph. Instead, the vertices are partitioned over multiple players (hospitals), and each player reports a subset of her vertices to the mechanism. In particular, a player may strategically omit vertices to increase how many of her vertices lie on the path returned by the mechanism. Our objective is to find mechanisms that limit incentives for such manipulation while producing long paths. Unfortunately, in worst-case instances, competing with the overall longest path is impossible while incentivizing (approximate) truthfulness, i.e., requiring that hiding nodes cannot increase a player's utility by more than a factor of $1 + o(1)$. We therefore adopt a semi-random model where a small ($o(n)$) number of random edges are added to worst-case instances. While it remains impossible for truthful mechanisms to compete with the overall longest path, we give a truthful mechanism that competes with a weaker but non-trivial benchmark: the length of any path whose subpaths within each player have a minimum average length. In fact, our mechanism satisfies even a stronger notion of truthfulness, which we call matching-time incentive compatibility. This notion of truthfulness requires that each player not only reports her nodes truthfully but also does not stop the returned path at any of her nodes in order to divert it to a continuation inside her own subgraph.

Posted Content
TL;DR: In this paper, a mechanism design approach is used to study the data buyer's optimal data market model with differential privacy, where each data owner privately possesses an intrinsic motive and an instrumental motive.
Abstract: Privacy is essential in data trading markets. This work uses a mechanism design approach to study the data buyer's optimal data market model with differential privacy. Motivated by the discovery of individuals' dual motives for privacy protection, we consider that each data owner privately possesses an intrinsic motive and an instrumental motive. We study optimal market design in a dynamic environment by determining the privacy assignment rule that specifies the privacy protection at each data usage and the payment rules to compensate for the privacy loss when the owners' instrumental motive is endogenously dynamic due to the buyer's dynamic activities. Due to the privacy-utility tradeoff of differential privacy, privacy loss is inevitable when data is traded with privacy protection. To mitigate the risk of uncertainties, we allow the owners to leave the market using optimal stopping time if the accumulated privacy loss is beyond their privacy budgets that depend on their intrinsic motives. In order to influence the data owners' stopping decisions, the data buyer uses a stopping payment rule that is independent of the data owners' preferences and specifies a monetary transfer to a data owner only at the period when he decides to stop at the end of that period. We introduce the notion of dynamic incentive compatibility to capture the joint deviations from optimal stopping and truthful reporting. Under a monotonicity assumption about the dynamics, the optimal stopping rule can be formulated as a threshold-based rule. A design principle is provided by a sufficient condition of dynamic incentive compatibility. We relax the buyer's optimal market design by characterizing the monetary transfer rules in terms of privacy assignment rules and the threshold functions. To address the analytical intractability, we provide a sufficient condition for a relaxed dynamic incentive-compatible model.

Posted Content
TL;DR: A modification of Zielonka's classic algorithm that brings its complexity down to $n^{\mathcal{O}\left(\log\left(1+\frac{d}{\log n}\right)\right)}$ is presented in this paper.
Abstract: Zielonka's classic recursive algorithm for solving parity games is perhaps the simplest among the many existing parity game algorithms. However, its complexity is exponential, while currently the state-of-the-art algorithms have quasipolynomial complexity. Here, we present a modification of Zielonka's classic algorithm that brings its complexity down to $n^{\mathcal{O}\left(\log\left(1+\frac{d}{\log n}\right)\right)}$, for parity games of size $n$ with $d$ priorities, in line with previous quasipolynomial-time solutions.
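
For reference, here is a minimal sketch of the classic (exponential-time) recursive algorithm that the abstract takes as its starting point. It is not the quasipolynomial modification from the paper. The representation is an assumption made for illustration: `owner[v]` is 0 or 1, `priority[v]` is a non-negative integer, `succ[v]` lists successors, every node has at least one successor, and player 0 wins a play if the highest priority seen infinitely often is even.

```python
def attractor(nodes, owner, succ, player, target):
    """Nodes of the subgame `nodes` from which `player` can force a visit to `target`."""
    attr = set(target)
    changed = True
    while changed:
        changed = False
        for v in nodes - attr:
            moves = [w for w in succ[v] if w in nodes]
            if owner[v] == player:
                pulled = any(w in attr for w in moves)  # player picks one good move
            else:
                pulled = all(w in attr for w in moves)  # opponent cannot escape
            if pulled:
                attr.add(v)
                changed = True
    return attr


def zielonka(nodes, owner, priority, succ):
    """Return (W0, W1), the winning regions of players 0 and 1."""
    if not nodes:
        return set(), set()
    top = max(priority[v] for v in nodes)
    player = top % 2  # the player who benefits from the top priority
    opponent = 1 - player
    top_nodes = {v for v in nodes if priority[v] == top}
    a = attractor(nodes, owner, succ, player, top_nodes)
    win = list(zielonka(nodes - a, owner, priority, succ))
    if not win[opponent]:
        # The opponent wins nowhere in the rest, so `player` wins everywhere.
        win[player] = set(nodes)
        return win[0], win[1]
    # Otherwise remove what the opponent can force into its region and recurse.
    b = attractor(nodes, owner, succ, opponent, win[opponent])
    win = list(zielonka(nodes - b, owner, priority, succ))
    win[opponent] = win[opponent] | b
    return win[0], win[1]


if __name__ == "__main__":
    owner = {0: 0, 1: 1, 2: 0}
    priority = {0: 2, 1: 1, 2: 0}
    succ = {0: [0], 1: [1, 0], 2: [1]}
    print(zielonka({0, 1, 2}, owner, priority, succ))
```

In the small example, node 0 loops on an even priority, node 1 can loop on an odd priority, and node 2 is forced to move into node 1.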

Posted Content
TL;DR: In this paper, the authors study the group-fair facility location problem where agents are divided into groups based on criteria such as race, gender, or age, and design mechanisms to locate a facility to (approximately) minimize the costs of groups of agents to the facility fairly.
Abstract: Motivated by the societal need to provide fair accessibility or representation among groups of agents, we study the group-fair facility location problems where agents are divided into groups based on criteria such as race, gender, or age. The agents are located on a real line, modeling agents' private ideal preferences/points for the facility's location (e.g., a public school or representative). Our aim is to design mechanisms to locate a facility to (approximately) minimize the costs of groups of agents to the facility fairly while eliciting the agents' private locations truthfully. We first introduce various well-motivated group-fair cost objectives and show that many natural objectives have an unbounded approximation ratio. We then consider the objectives of minimizing the maximum total group cost and minimizing the average group cost. For the first objective, we show that the approximation ratio of the median mechanism depends on the number of groups and provide a new group-based mechanism with an approximation ratio of 3. For the second objective, the median mechanism obtains a ratio of 3, and we propose a randomized mechanism that obtains a better approximation ratio. We also provide lower bounds for both objectives. We then study the notion of intergroup and intragroup fairness that measures fairness between groups and within each group. We consider various objectives and provide mechanisms with tight approximation ratios.
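
As a small illustration of the median mechanism and the first objective, the sketch below places the facility at a median of the reported locations and evaluates the maximum total group cost. The function names, the use of the lower median for tie-breaking, and the reading of "total group cost" as the sum of distances within a group are assumptions made here, not details confirmed by the abstract.

```python
from statistics import median_low


def median_mechanism(locations):
    """Place the facility at the (lower) median of the reported locations."""
    return median_low(locations)


def max_total_group_cost(facility, locations, groups):
    """Largest, over groups, of the summed distances from members to the facility."""
    totals = {}
    for x, g in zip(locations, groups):
        totals[g] = totals.get(g, 0) + abs(x - facility)
    return max(totals.values())


if __name__ == "__main__":
    locations = [0, 1, 10]
    groups = ["a", "a", "b"]
    y = median_mechanism(locations)
    print(y, max_total_group_cost(y, locations, groups))
```

In this toy instance the median sits inside the larger group, so the single agent of group "b" bears a high cost. A group-aware mechanism could place the facility elsewhere to lower the maximum group cost.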