scispace - formally typeset

Showing papers on "Stackelberg competition" published in 2020


Journal ArticleDOI
TL;DR: Under Stackelberg equilibrium (SE), the costs incurred by a consumer for procuring either the RES or nRES are significantly reduced while the derived utility by producer is maximized and the CO2 emission cost and consequently the energy cost are minimized.
Abstract: Traditionally, energy consumers pay non-commodity charges (e.g., transmission, environmental and network costs) as a major component of their energy bills. With the distributed energy generation, enabling energy consumption close to producers can minimize such costs. The physically constrained energy prosumers in power networks can be logically grouped into virtual microgrids (VMGs) using telecommunication systems. Prosumer benefits can be optimised by modelling the energy trading interactions among producers and consumers in a VMG as a Stackelberg game in which producers lead and consumers follow. Considering renewable (RES) and non-renewable energy (nRES) resources, and given that RES are unpredictable thus unschedulable, we also describe cost and utility models that include load uncertainty demands of producers. The results show that under Stackelberg equilibrium (SE), the costs incurred by a consumer for procuring either the RES or nRES are significantly reduced while the derived utility by producer is maximized. We further show that when the number of prosumers in the VMG increases, the CO2 emission cost and consequently the energy cost are minimized at the SE. Lastly, we evaluate the peer-to-peer (P2P) energy trading scenario involving noncooperative energy prosumers with and without Stackelberg game. The results show that the P2P energy prosumers attain 47% higher benefits with Stackelberg game.

196 citations


Journal ArticleDOI
TL;DR: In this paper, a peer-to-peer (P2P) energy trading scheme that can help a centralized power system to reduce the total electricity demand of its customers at the peak hour is proposed.
Abstract: This paper proposes a peer-to-peer (P2P) energy trading scheme that can help a centralized power system to reduce the total electricity demand of its customers at the peak hour. To do so, a cooperative Stackelberg game is formulated, in which the centralized power system acts as the leader that needs to decide on a price at the peak demand period to incentivize prosumers to not seek any energy from it. The prosumers, on the other hand, act as followers and respond to the leader’s decision by forming suitable coalitions with neighboring prosumers in order to participate in P2P energy trading to meet their energy demand. The properties of the proposed Stackelberg game are studied. It is shown that the game has a unique and stable Stackelberg equilibrium, as a result of the stability of prosumers’ coalitions. At the equilibrium, the leader chooses its strategy using a derived closed-form expression, while the prosumers choose their equilibrium coalition structure. An algorithm is proposed that enables the centralized power system and the prosumers to reach the equilibrium solution. Numerical case studies demonstrate the beneficial properties of the proposed scheme.

155 citations


Journal ArticleDOI
TL;DR: This paper develops a consortium blockchain-based secure energy trading mechanism for V2G, and proposes an efficient incentive mechanism based on contract theory and edge computing to improve the success probability of block creation.
Abstract: Smart grid has emerged as a successful application of cyber-physical systems in the energy sector. Among numerous key technologies of the smart grid, vehicle-to-grid (V2G) provides a promising solution to reduce the level of demand–supply mismatch by leveraging the bidirectional energy-trading capabilities of electric vehicles. In this paper, we propose a secure and efficient V2G energy trading framework by exploring blockchain, contract theory, and edge computing. First, we develop a consortium blockchain-based secure energy trading mechanism for V2G. Then, we consider the information asymmetry scenario, and propose an efficient incentive mechanism based on contract theory. The social welfare optimization problem falls into the category of difference of convex programming and is solved by using the iterative convex–concave procedure algorithm. Next, edge computing is incorporated to improve the success probability of block creation. The computational resource allocation problem is modeled as a two-stage Stackelberg leader–follower game, and the optimal strategies are obtained by using the backward induction approach. Finally, the performance of the proposed framework is validated via numerical results and theoretical analysis.

144 citations
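
The backward induction mentioned in the abstract above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the follower utility `a*log(1+x) - price*x`, the leader's constant unit cost, and all function names are hypothetical. Stage 2 (the followers' best response) is solved first, and stage 1 (the leader's price) is then chosen by a simple grid search that anticipates those responses.

```python
def follower_best_response(price, a):
    """Stage 2: a follower maximizes a*log(1+x) - price*x over x >= 0.

    Setting the derivative a/(1+x) - price to zero gives x = a/price - 1,
    clipped at zero when the price is too high (hypothetical utility).
    """
    return max(a / price - 1.0, 0.0)


def leader_profit(price, cost, weights):
    """Leader profit when every follower plays its best response."""
    demand = sum(follower_best_response(price, a) for a in weights)
    return (price - cost) * demand


def solve_by_backward_induction(cost, weights, p_max, steps=20000):
    """Stage 1: search a price grid in (cost, p_max], anticipating stage 2."""
    best_price, best_profit = cost, 0.0
    for k in range(1, steps + 1):
        price = cost + (p_max - cost) * k / steps
        profit = leader_profit(price, cost, weights)
        if profit > best_profit:
            best_price, best_profit = price, profit
    demands = [follower_best_response(best_price, a) for a in weights]
    return best_price, demands, best_profit


if __name__ == "__main__":
    price, demands, profit = solve_by_backward_induction(
        cost=2.0, weights=[8.0, 8.0], p_max=10.0
    )
    print(price, demands, profit)
```

For this toy utility, when all followers buy a positive amount, the leader's first-order condition gives the price sqrt(cost * sum(weights) / n), which equals 4 for the values in the usage example, so the grid search can be checked against it.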


Journal ArticleDOI
01 Mar 2020
TL;DR: The analytically obtained equilibrium solution of a Stackelberg game indicates that with a limited budget, the model owner should judiciously decide on the number of workers due to the trade-off between the diversity provided by the number of workers and the latency of completing the training.
Abstract: Due to the large size of the training data, distributed learning approaches such as federated learning have gained attention recently. However, the convergence rate of distributed learning suffers from heterogeneous worker performance. In this letter, we consider an incentive mechanism for workers to mitigate the delays in completion of each batch. We analytically obtain the equilibrium solution of a Stackelberg game. Our numerical results indicate that with a limited budget, the model owner should judiciously decide on the number of workers due to the trade-off between the diversity provided by the number of workers and the latency of completing the training.

124 citations


Journal ArticleDOI
TL;DR: The proposed mechanism improves the mining utility in mining networks while ensuring the maximum profit of the edge cloud operator; under the proposed mechanism, mining networks obtain 6.86% more profits on average.
Abstract: Blockchain technology is developing rapidly and has been applied in various areas, among which the Internet of Things (IoT) offers broad prospects. However, IoT mobile devices are restricted in communication and computation due to mobility and portability, so they cannot afford the high computing cost of the blockchain mining process. To solve this, the free resources available on non-mining devices and the edge cloud are selected to construct a collaborative mining network (CMN) to execute mining tasks for mobile blockchain. Miners can offload their mining tasks to non-mining devices within a CMN or to the edge cloud when there are insufficient resources. Considering competition for the resources of non-mining devices, the resource allocation problem in a CMN is formulated as a double auction game, in which the Bayes-Nash Equilibrium (BNE) is analyzed to determine the optimal auction price. When offloading to the edge cloud, a Stackelberg game is adopted to model interactions between the edge cloud operator and different CMNs to obtain the optimal resource price and devices’ resource demands. The mechanism improves the mining utility in mining networks while ensuring the maximum profit of the edge cloud operator. Finally, profits of mining networks are compared with an existing mode which only considers offloading to the edge cloud. Under the proposed mechanism, mining networks obtain 6.86% more profits on average.

109 citations


Journal ArticleDOI
TL;DR: In this paper, a game theory-based pricing model is proposed in a localized Practical Byzantine Fault Tolerance based-Consortium Blockchain (PBFT-CB), where both the interactions between seller and buyer and the interactions among sellers are considered.

105 citations


Journal ArticleDOI
TL;DR: In this paper, the authors study the strategic interactions between an aggregator, its consumers and the day-ahead electricity market using a bilevel optimization framework, where the aggregator-consumer interaction is captured either as a Stackelberg or a Nash Bargaining Game, leveraging chance-constrained programming to model limited controllability of residential DR loads.
Abstract: To decarbonize the heating sector, residential consumers may install heat pumps. Coupled with heating loads with high thermal inertia, these thermostatically controlled loads may provide a significant source of demand side flexibility. Since the capacity of residential consumers is typically insufficient to take part in the day-ahead electricity market (DAM), aggregators act as mediators that monetize the flexibility of these loads through demand response (DR). In this paper, we study the strategic interactions between an aggregator, its consumers and the DAM using a bilevel optimization framework. The aggregator-consumer interaction is captured either as a Stackelberg or a Nash Bargaining Game, leveraging chance-constrained programming to model limited controllability of residential DR loads. The aggregator takes strategic positions in the DAM, considering the uncertainty on the market outcome, represented as a stochastic Stackelberg Game. Results show that the DR provider-aggregator cooperation may yield significant monetary benefits. The aggregator cost-effectively manages the uncertainty on the DAM outcome and the limited controllability of its consumers. The presented methodology may be used to assess the value of DR in a deregulated power system or may be directly integrated in the daily routine of DR aggregators.

94 citations


Journal ArticleDOI
TL;DR: This study presents a bi-level optimization model to describe the interaction behaviors between decision makers and moderator, and develops the consensus mechanism with maximum-return modifications and minimum-cost feedback (MRMCCM).

92 citations


Journal ArticleDOI
TL;DR: The energy scheduling for a three-level IES is investigated by applying the hierarchical Stackelberg game approach and the operation strategies of all market participants are derived with analytical solutions, verified by a decentralized algorithm developed in this study.

87 citations


Journal ArticleDOI
TL;DR: In this article, the authors consider autonomous racing of two cars and present an approach to formulate racing decisions as a noncooperative nonzero-sum game, where the players aim to fulfill static track constraints as well as avoid collision with each other.
Abstract: We consider autonomous racing of two cars and present an approach to formulate racing decisions as a noncooperative nonzero-sum game. We design three different games where the players aim to fulfill static track constraints as well as avoid collision with each other; the latter constraint depends on the combined actions of the two players. The difference between the games is the collision constraints and the payoff. In the first game, collision avoidance is only considered by the follower, and each player maximizes their own progress toward the finish line. We show that, thanks to the sequential structure of this game, equilibria can be computed through an efficient sequential maximization approach. Furthermore, we show that these actions, if feasible, are also a Stackelberg and Nash equilibrium in pure strategies of our second game where both players consider the collision constraints. The payoff of our third game is designed to promote blocking, by additionally rewarding the cars for staying ahead at the end of the horizon. We show that this changes the Stackelberg equilibrium, but has a minor influence on the Nash equilibria. For online implementation, we propose to play the games in a moving horizon fashion and discuss two methods for guaranteeing feasibility of the resulting coupled repeated games. Finally, we study the performance of the proposed approaches in simulation for a setup that replicates the miniature race car tested at the Automatic Control Laboratory, ETH Zurich, Switzerland. The simulation study shows that the presented games can successfully model different racing behaviors and generate interesting racing situations.

81 citations
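
The sequential maximization idea in the abstract above (the leader maximizes its own progress, then the follower maximizes progress subject to collision avoidance) can be sketched on a toy discrete model. The grid of lanes, the `(lane, step)` action encoding, the safety `gap`, and the function name are all hypothetical simplifications introduced here; the paper's vehicle model and constraints are not reproduced.

```python
def sequential_maximization(leader_pos, follower_pos, actions, lanes, gap):
    """Toy sequential maximization for a two-car race (illustrative only).

    actions: list of (lane, step) pairs, step = progress along the track.
    lanes:   set of lanes allowed by the static track constraint.
    gap:     minimum distance required between cars in the same lane.
    """
    # Static track constraint, shared by both cars.
    feasible = [(lane, step) for lane, step in actions if lane in lanes]

    # The leader ignores the follower and simply maximizes its progress.
    leader_move = max(feasible, key=lambda a: a[1])
    leader_lane, leader_end = leader_move[0], leader_pos + leader_move[1]

    # Only the follower accounts for the collision constraint.
    def is_safe(action):
        lane, step = action
        return lane != leader_lane or abs(follower_pos + step - leader_end) >= gap

    safe_moves = [a for a in feasible if is_safe(a)]
    if not safe_moves:
        return leader_move, None  # no feasible follower action in this toy model
    follower_move = max(safe_moves, key=lambda a: a[1])
    return leader_move, follower_move


if __name__ == "__main__":
    moves = [(0, 1), (0, 2), (1, 1), (1, 2)]
    print(sequential_maximization(5, 4, moves, lanes={0, 1}, gap=2))
```

In the usage example the leader takes the fastest move in lane 0, and the follower, blocked from doing the same by the safety gap, switches to lane 1 to keep its full progress.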


Journal ArticleDOI
TL;DR: It is found that the government's over-allocated carbon credits may damage the manufacturer's profit with wholesale price or revenue sharing contract, which can increase the difficulty of implementing cap-and-trade regulation.
Abstract: After consideration of a supply chain consisting of a manufacturer and a retailer, this paper uses a two-stage Stackelberg game to explore the production decision as well as the government cap sett...

Journal ArticleDOI
TL;DR: A unique Stackelberg equilibrium is achieved where the SES provider's revenue is maximized and the user-level social cost is minimized, which also rewards the retailer, and results confirm that the retailer can also benefit financially, in addition to the SES provider and the users.
Abstract: Here, a novel energy trading system is proposed for demand-side management of a neighborhood area network (NAN) consisting of a shared energy storage (SES) provider, users with non-dispatchable energy generation, and an electricity retailer. In a leader–follower Stackelberg game, the SES provider first maximizes their revenue by setting a price signal and trading energy with the grid. Then, by following the SES provider's actions, the retailer minimizes social cost for the users, i.e., the sum of the total users’ cost when they interact with the SES and the total cost for supplying grid energy to the users. A pricing strategy, which incorporates mechanism design, is proposed to make the system incentive-compatible by rewarding users who disclose true energy usage information. A unique Stackelberg equilibrium is achieved where the SES provider's revenue is maximized and the user-level social cost is minimized, which also rewards the retailer. A case study with realistic energy demand and generation data demonstrates 28–45% peak demand reduction of the NAN, depending on the number of participating users, compared to a system without SES. Simulation results confirm that the retailer can also benefit financially, in addition to the SES provider and the users.

Journal ArticleDOI
TL;DR: This work shows an incentive-based interaction between the crowdsourcing platform and the participating client’s independent strategies for training a global learning model, where each side maximizes its own benefit and proposes a novel crowdsourcing framework to leverage FL that considers the communication efficiency during parameters exchange.
Abstract: Federated learning (FL) rests on the notion of training a global model in a decentralized manner. Under this setting, mobile devices perform computations on their local data before uploading the required updates to improve the global model. However, when the participating clients implement an uncoordinated computation strategy, the difficulty is to handle the communication efficiency (i.e., the number of communications per iteration) while exchanging the model parameters during aggregation. Therefore, a key challenge in FL is how users participate to build a high-quality global model with communication efficiency. We tackle this issue by formulating a utility maximization problem, and propose a novel crowdsourcing framework to leverage FL that considers the communication efficiency during parameters exchange. First, we show an incentive-based interaction between the crowdsourcing platform and the participating client’s independent strategies for training a global learning model, where each side maximizes its own benefit. We formulate a two-stage Stackelberg game to analyze such scenario and find the game’s equilibria. Second, we formalize an admission control scheme for participating clients to ensure a level of local accuracy. Simulated results demonstrate the efficacy of our proposed solution with up to 22% gain in the offered reward.

Posted Content
Peng Hang, Chen Lv, Yang Xing, Chao Huang, Zhongxu Hu
TL;DR: Testing results indicate that both the Nash equilibrium and Stackelberg game theoretic approaches can provide reasonable human-like decision making for AVs.
Abstract: Considering that human-driven vehicles and autonomous vehicles (AVs) will coexist on roads in the future for a long time, how to merge AVs into human drivers traffic ecology and minimize the effect of AVs and their misfit with human drivers, are issues worthy of consideration. Moreover, different passengers have different needs for AVs, thus, how to provide personalized choices for different passengers is another issue for AVs. Therefore, a human-like decision making framework is designed for AVs in this paper. Different driving styles and social interaction characteristics are formulated for AVs regarding driving safety, ride comfort and travel efficiency, which are considered in the modeling process of decision making. Then, Nash equilibrium and Stackelberg game theory are applied to the noncooperative decision making. In addition, potential field method and model predictive control (MPC) are combined to deal with the motion prediction and planning for AVs, which provides predicted motion information for the decision-making module. Finally, two typical testing scenarios of lane change, i.e., merging and overtaking, are carried out to evaluate the feasibility and effectiveness of the proposed decision-making framework considering different human-like behaviors. Testing results indicate that both the two game theoretic approaches can provide reasonable human-like decision making for AVs. Compared with the Nash equilibrium approach, under the normal driving style, the cost value of decision making using the Stackelberg game theoretic approach is reduced by over 20%.

Posted Content
TL;DR: A new framework that casts MBRL as a game between a policy player, which attempts to maximize rewards under the learned model, and a model player, which attempts to fit the real-world data collected by the policy player; this gives rise to two natural families of algorithms for MBRL based on which player is chosen as the leader in the Stackelberg game.
Abstract: Model-based reinforcement learning (MBRL) has recently gained immense interest due to its potential for sample efficiency and ability to incorporate off-policy data. However, designing stable and efficient MBRL algorithms using rich function approximators has remained challenging. To help expose the practical challenges in MBRL and simplify algorithm design from the lens of abstraction, we develop a new framework that casts MBRL as a game between: (1) a policy player, which attempts to maximize rewards under the learned model; (2) a model player, which attempts to fit the real-world data collected by the policy player. For algorithm development, we construct a Stackelberg game between the two players, and show that it can be solved with approximate bi-level optimization. This gives rise to two natural families of algorithms for MBRL based on which player is chosen as the leader in the Stackelberg game. Together, they encapsulate, unify, and generalize many previous MBRL algorithms. Furthermore, our framework is consistent with and provides a clear basis for heuristics known to be important in practice from prior works. Finally, through experiments we validate that our proposed algorithms are highly sample efficient, match the asymptotic performance of model-free policy gradient, and scale gracefully to high-dimensional tasks like dexterous hand manipulation. Additional details and code can be obtained from the project page at this https URL

Journal ArticleDOI
TL;DR: The results show that products’ deterioration rate and quality dropping rate have significant impacts on the firms’ delivery time decisions, as well as the pricing and inventory decisions, in a single-retailer-single-vendor dual-channel supply chain model.
Abstract: Dual-channel supply chain structure, i.e., a traditional retail channel supplemented by an online direct channel, is widely adopted by many firms, including some companies selling deteriorating products (e.g., fruits, vegetables and meats). However, few papers in the literature consider the deterioration property of products in dual-channel business models. In this paper, a single-retailer-single-vendor dual-channel supply chain model is studied, in which the vendor sells deteriorating products through its direct online channel and the indirect retail channel. In addition to quantity deterioration, quality of the products also drops with time and affects the demand rate in the retail channel. The pricing decisions and the inventory decisions for the two firms are simultaneously studied. Models of centralized (i.e., the two firms make decisions jointly) and decentralized (i.e., the two firms make decisions separately, vendor as the Stackelberg leader) problems are established. Proper algorithms are proposed to obtain the optimal decisions of prices, ordering frequencies and ordering quantities. The results suggest that decentralization of the supply chain not only erodes the two firms’ profit, but also incurs higher waste compared to that under centralization. However, a revenue sharing and two-part tariff contract can coordinate the supply chain. Under the contract, each firm’s profit is improved and the total waste rate of the supply chain is reduced. It is also shown that the contract is more efficient for both firms under higher product deterioration rate. Besides, the contract is more efficient for the retailer, while less efficient for the vendor under higher quality dropping rate. In the model extension, online channel delivery time is assumed to be endogenous and linked to demands in both channels. The results show that products’ deterioration rate and quality dropping rate have significant impacts on the firms’ delivery time decisions, as well as the pricing and inventory decisions.

Journal ArticleDOI
TL;DR: In this paper, a three-level closed-loop supply chain consisting of a manufacturer, retailer, and third-party collector is considered, where the manufacturer builds simultaneously new products from raw materials and remanufactures the returning products.

Journal ArticleDOI
TL;DR: A novel Wirelessly Powered Edge intelliGence (WPEG) framework is proposed, which aims to achieve a stable, robust, and sustainable edge intelligence by energy harvesting methods, and a permissioned edge blockchain is built to secure the peer-to-peer energy and knowledge sharing in this framework.
Abstract: Recently, edge artificial intelligence techniques (e.g., federated edge learning) have emerged to unleash the potential of big data from the Internet of Things (IoT). By learning knowledge on local devices, data privacy and quality of service (QoS) are guaranteed. Nevertheless, the dilemma between the limited on-device battery capacities and the high energy demands in learning is not resolved. When the on-device battery is exhausted, the edge learning process will have to be interrupted. In this paper, we propose a novel Wirelessly Powered Edge intelliGence (WPEG) framework, which aims to achieve a stable, robust, and sustainable edge intelligence by energy harvesting (EH) methods. Firstly, we build a permissioned edge blockchain to secure the peer-to-peer (P2P) energy and knowledge sharing in our framework. To maximize edge intelligence efficiency, we then investigate the wirelessly-powered multi-agent edge learning model and design the optimal edge learning strategy. Moreover, by constructing a two-stage Stackelberg game, the underlying energy-knowledge trading incentive mechanisms are also proposed with the optimal economic incentives and power transmission strategies. Finally, simulation results show that our incentive strategies could optimize the utilities of both parties compared with classic schemes, and our optimal learning design could realize the optimal learning efficiency.

Journal ArticleDOI
Shuangrui Yin, Qian Ai, Zhaoyu Li, Yufan Zhang, Tianguang Lu
TL;DR: The proposed model and method address day-ahead energy management for aggregate prosumers considering the uncertainty of intermittent renewable energy output and market price; simulation results support their rationality and validity.

Journal ArticleDOI
TL;DR: In this article, the impacts of pricing time on the profitability and stability of a supply chain system under policy intervention are investigated, with a focus on the development of electric vehicles.

Journal ArticleDOI
TL;DR: A Stackelberg game-based framework is proposed in which EUs and MECs act as followers and leaders, respectively; each MEC achieves the maximum revenue while each EU obtains utility-maximized resources under budget constraints.

Proceedings Article
12 Jul 2020
TL;DR: This work provides insights into the optimization landscape of zero-sum games by establishing connections between Nash and Stackelberg equilibria along with the limit points of simultaneous gradient descent, and derives novel gradient-based learning dynamics emulating the natural structure of a Stackelberg game using the implicit function theorem.
Abstract: Contemporary work on learning in continuous games has commonly overlooked the hierarchical decision-making structure present in machine learning problems formulated as games, instead treating them as simultaneous play games and adopting the Nash equilibrium solution concept. We deviate from this paradigm and provide a comprehensive study of learning in Stackelberg games. This work provides insights into the optimization landscape of zero-sum games by establishing connections between Nash and Stackelberg equilibria along with the limit points of simultaneous gradient descent. We derive novel gradient-based learning dynamics emulating the natural structure of a Stackelberg game using the implicit function theorem and provide convergence analysis for deterministic and stochastic updates for zero-sum and general-sum games. Notably, in zero-sum games using deterministic updates, we show the only critical points the dynamics converge to are Stackelberg equilibria and provide a local convergence rate. Empirically, our learning dynamics mitigate rotational behavior and exhibit benefits for training generative adversarial networks compared to simultaneous gradient descent.
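
To make the implicit-function-theorem idea in the abstract above concrete, here is a minimal scalar sketch. It is an illustrative toy under stated assumptions, not the paper's algorithm or experiments: the two cost functions, the step sizes, and the function name are hypothetical. The leader descends the total derivative of its cost, which includes the term for how the follower's best response moves with the leader's action, while the follower descends its own partial gradient.

```python
def stackelberg_gradient_play(x=0.0, y=0.0, lr_leader=0.05, lr_follower=0.1, steps=500):
    """Toy Stackelberg gradient dynamics on a hypothetical general-sum game.

    Leader cost:   f1(x, y) = (x - 1)**2 + x*y
    Follower cost: f2(x, y) = (y - x)**2
    """
    for _ in range(steps):
        # Partial derivatives of the two hypothetical costs.
        f1_x = 2.0 * (x - 1.0) + y
        f1_y = x
        f2_y = 2.0 * (y - x)
        f2_yx = -2.0  # d/dx of f2_y
        f2_yy = 2.0   # d/dy of f2_y, positive so the follower problem is convex

        # Implicit function theorem: the follower's best response satisfies
        # f2_y(x, y*(x)) = 0, hence dy*/dx = -f2_yx / f2_yy.
        dy_dx = -f2_yx / f2_yy

        # Leader uses the total derivative; follower uses its own gradient.
        total_grad = f1_x + dy_dx * f1_y
        x, y = x - lr_leader * total_grad, y - lr_follower * f2_y
    return x, y


if __name__ == "__main__":
    print(stackelberg_gradient_play())
```

For these costs the follower's best response is y = x, so the leader effectively minimizes (x - 1)**2 + x**2 and the dynamics settle near x = y = 0.5. Plain simultaneous gradient play on the same costs would instead have its stationary point at x = y = 2/3, which shows how the leader's anticipation term changes the solution.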

Journal ArticleDOI
TL;DR: Simulation results show that the proposed game theoretic approaches can motivate EVs to smooth out the power fluctuations from the grid while EVs schedule their charging/discharging activities to maximize their utilities.
Abstract: Due to the increasing popularity of electric vehicles (EVs) and technological advancements of EV electronics, the vehicle-to-grid (V2G) technique, which utilizes EVs to provide ancillary services for the power grid, stimulates new ideas in current smart grid research. Since EVs are selfish individuals owned by different parties, how to motivate them to provide ancillary services becomes an issue. In this paper, game theoretic approaches using non-cooperative and cooperative game are proposed to motivate EVs to provide frequency regulation services for the power grid. In a non-cooperative V2G system, the interaction between the EV aggregator and EVs is formulated as a non-cooperative Stackelberg game. The EV aggregator as the leader decides the electricity trading price, and EVs as the followers determine their charging/discharging strategies. In a cooperative V2G system, a potential game is formulated to achieve the optimal social welfare of the V2G system. The existence and uniqueness of the Nash equilibrium of these two games are validated. Our simulation results show that the proposed game theoretic approaches can motivate EVs to smooth out the power fluctuations from the grid while EVs schedule their charging/discharging activities to maximize their utilities. This demonstrates the effectiveness of the use of the V2G game in providing regulation services to the grid. Through cooperation and extra information exchange, the social welfare of EVs and the EV aggregator can be improved to the global optimum and the V2G regulation services can also achieve near-optimal performance.

Journal ArticleDOI
TL;DR: A computation offloading mechanism based on a two-stage Stackelberg game is proposed to analyze the interaction between multiple edge clouds and multiple IIoT devices, and results show that the proposed scheme is conducive to seeking the appropriate price and computation requirement.
Abstract: Relying on the computation offloading technology, edge computing has shown potential for processing countless tasks in the industrial Internet of Things (IIoT), which is composed of multiple edge clouds and multiple IIoT devices. Nevertheless, with increasing demands for computation service, how to design a reliable transmission mechanism and allocate proper computation resources have become bottlenecks. In this article, we propose a computation offloading mechanism based on a two-stage Stackelberg game to analyze the interaction between multiple edge clouds and multiple IIoT devices. To be specific, the edge clouds are denoted as leaders who set the appropriate price for their computation resource. Besides considering the payment cost, the IIoT devices, which are termed the followers, formulate their utility function by considering the social interaction information from the potential IIoT devices. The existence and uniqueness of the Stackelberg equilibrium are analyzed considering two possible cases, i.e., complete information and incomplete information. Moreover, two dynamic iterative algorithms are invoked for solving both problem models, respectively. Finally, experimental results show that our proposed scheme is conducive to seeking the appropriate price and computation requirement. Besides, social interaction information plays an important role in achieving a reasonable computation requirement for IIoT devices.

Journal ArticleDOI
TL;DR: Considering reference emission and cost learning effects, the authors investigate a Stackelberg differential game, where a manufacturer acts as a leader and determines the wholesale price and emission reduction level, and a retailer acts as a follower and sets the retail price.

Journal ArticleDOI
TL;DR: Three dynamic pricing mechanisms for resource allocation of edge computing for the IoT environment with a comparative analysis are considered: BID-proportional allocation mechanism, uniform pricing mechanism, and fairness-seeking differentiated pricing mechanism (FAID-PRIM).
Abstract: With the widespread use of the Internet of Things (IoT), edge computing has recently emerged as a promising technology to tackle low-latency and security issues with personal IoT data. In this regard, many works have been concerned with computing resource allocation of the edge computing server, and some studies have additionally examined pricing schemes for resource allocation. However, few works have attempted to compare the various kinds of pricing schemes. In addition, some schemes have limitations, such as fairness issues in differentiated pricing schemes. To tackle these limitations, this article considers three dynamic pricing mechanisms for resource allocation of edge computing for the IoT environment with a comparative analysis: BID-proportional allocation mechanism (BID-PRAM), uniform pricing mechanism (UNI-PRIM), and fairness-seeking differentiated pricing mechanism (FAID-PRIM). BID-PRAM is newly proposed to overcome the limitation of the auction-based pricing scheme; UNI-PRIM is a basic uniform pricing scheme; FAID-PRIM is newly proposed to tackle the fairness issues of the differentiated pricing scheme. BID-PRAM is formulated as a noncooperative game. UNI-PRIM and FAID-PRIM are formulated as a single-leader–multiple-followers Stackelberg game. In each mechanism, the Nash equilibrium (NE) or Stackelberg equilibrium (SE) solution is given with the proof of existence and uniqueness. Numerical results validate the proposed theorems and present a comparative analysis of three mechanisms. Through these analyses, the advantages and disadvantages of each model are identified, to give edge computing service providers guidance on various kinds of pricing schemes.

Journal ArticleDOI
TL;DR: In this paper, the authors investigated the value of guarantor financing in a capital-constrained supply chain and the impact of leadership structure on financing decisions.
Abstract: This study investigates manufacturer guarantor financing (MG) and third‐party logistics (3PL) guarantor financing (LG) in a four‐party supply chain game that features a manufacturer, a 3PL, a capital‐constrained retailer, and a bank. The manufacturer or 3PL can act as the guarantor for the retailer who borrows bank credit. Two different leadership structures are investigated, namely, Nash game and manufacturer leadership Stackelberg game, where the manufacturer and 3PL decide simultaneously and sequentially, respectively. The supply chain under both leadership structures prefers guarantor financing to traditional bank financing when the supply chain is sufficiently cost‐efficient. In the Nash game, however, firms encounter a free‐rider dilemma when choosing between MG and LG, wherein both potential guarantors prefer the other to be the guarantor. This free‐rider dilemma can be resolved in the Stackelberg game. We also observe the follower–guarantor advantage in the Stackelberg game, wherein all firms favor the follower to provide guarantor financing. Our analysis shows that the supply chain under guarantor financing with a longer decision hierarchy (i.e., the Stackelberg game) can be conditionally more effective than that with a shorter one (i.e., the Nash game). By further analyzing different cost structures, pricing mechanism, and retailer’s initial capital, we find that most of our qualitative results remain accurate under more sophisticated conditions. These findings enhance our understanding of the value of guarantor financing in a capital‐constrained supply chain and the impact of leadership structure on financing decisions.

Journal ArticleDOI
03 Apr 2020
TL;DR: This paper proposes a novel bi-level actor-critic learning method that allows agents to have different knowledge bases (thus intelligent), while their actions can still be executed simultaneously and in a distributed manner, and considers Stackelberg equilibrium as a potentially better convergence point than Nash equilibrium in terms of Pareto superiority.
Abstract: Coordination is one of the essential problems in multi-agent systems. Typically, multi-agent reinforcement learning (MARL) methods treat agents equally, and the goal is to solve the Markov game to an arbitrary Nash equilibrium (NE) when multiple equilibria exist, thus lacking a solution for NE selection. In this paper, we treat agents unequally and consider Stackelberg equilibrium as a potentially better convergence point than Nash equilibrium in terms of Pareto superiority, especially in cooperative environments. Under Markov games, we formally define the bi-level reinforcement learning problem of finding Stackelberg equilibrium. We propose a novel bi-level actor-critic learning method that allows agents to have different knowledge bases (thus intelligent), while their actions can still be executed simultaneously and in a distributed manner. The convergence proof is given, while the resulting learning algorithm is tested against the state of the art. We found that the proposed bi-level actor-critic algorithm successfully converged to the Stackelberg equilibria in matrix games and found an asymmetric solution in a highway merge environment.

Journal ArticleDOI
TL;DR: A two-echelon supply chain in which an upstream manufacturer (M) and a downstream retailer (R) share the product liability cost caused by quality defects is considered; the analysis finds that a shift of the share of the ULQ liability cost from R to M enhances the supply chain efficiency under the RS structure.

Journal ArticleDOI
TL;DR: In this paper, the authors consider the pricing strategies of competing dual-channel retailers, focusing on whether and when they should adopt the BOPS strategy, and explore the impacts of market factors on the equilibrium outcomes.
Abstract: The “buy online and pick up in store” (BOPS) mode is gaining tremendous popularity among retailers since it is convenient for consumers and brings additional store sales to retailers. However, operating the BOPS channel requires additional investment, which is a challenge for retailers. This article considers the pricing strategies of competing dual-channel retailers, focuses on whether and when they should adopt the BOPS strategy, and explores the impacts of market factors on the equilibrium outcomes. Since retailers’ decisions are usually made sequentially in reality, we use the Stackelberg game model to analyze retailers’ optimal strategies. First, we show that the follower's price is not always lower than the leader's price. Specifically, when the unit additional profit from cross-selling of the follower is low enough, the follower will set a higher price than the leader. Second, we find that retailers prefer the BOPS strategy when the fixed costs for offering BOPS channels are low enough, or when the difference between the additional profits from cross-selling of two retailers is sufficiently large. Third, we present an interesting insight: an increase in product return probability or retailer cost of handling a returned product can be beneficial to retailers.