Proceedings Article

TacTex'13: a champion adaptive power trading agent

27 Jul 2014, pp. 465-471
TL;DR: The complex decision-making problem that TacTex'13 faces is formalized, its approximate solution via TacTex'13's constituent components is described, and the success of the complete agent is examined through analysis of competition results.
Abstract: Sustainable energy systems of the future will no longer be able to rely on the current paradigm that energy supply follows demand. Many of the renewable energy resources do not produce power on demand, and therefore there is a need for new market structures that motivate sustainable behaviors by participants. The Power Trading Agent Competition (Power TAC) is a new annual competition that focuses on the design and operation of future retail power markets, specifically in smart grid environments with renewable energy production, smart metering, and autonomous agents acting on behalf of customers and retailers. It uses a rich, open-source simulation platform that is based on real-world data and state-of-the-art customer models. Its purpose is to help researchers understand the dynamics of customer and retailer decision-making, as well as the robustness of proposed market designs. This paper introduces TACTEX'13, the champion agent from the inaugural competition in 2013. TACTEX'13 learns and adapts to the environment in which it operates, by heavily relying on reinforcement learning and prediction methods. This paper describes the constituent components of TACTEX'13 and examines its success through analysis of competition results and subsequent controlled experiments.
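As a rough illustration of the kind of adaptive, learning-based broker behavior the abstract describes (a minimal sketch, not TacTex'13's actual design; the candidate tariff prices, profit feedback, and function names are hypothetical):

```python
import random

# Minimal sketch of an adaptive broker: estimate the profitability of a few
# candidate fixed-rate tariffs from observed outcomes and choose among them
# epsilon-greedily. Illustrative only; not TacTex'13's implementation.

CANDIDATE_PRICES = [0.10, 0.12, 0.14, 0.16]   # hypothetical tariff rates ($/kWh)
EPSILON = 0.1                                 # exploration rate

value = {p: 0.0 for p in CANDIDATE_PRICES}    # running profit estimate per tariff
count = {p: 0 for p in CANDIDATE_PRICES}

def choose_price():
    """Epsilon-greedy choice over the candidate tariff prices."""
    if random.random() < EPSILON:
        return random.choice(CANDIDATE_PRICES)
    return max(CANDIDATE_PRICES, key=lambda p: value[p])

def update(price, observed_profit):
    """Incremental-mean update of the profit estimate for the published tariff."""
    count[price] += 1
    value[price] += (observed_profit - value[price]) / count[price]

# Per trading period: p = choose_price(); ...observe profit...; update(p, profit)
```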
Citations
Journal ArticleDOI
TL;DR: A conceptual framework for strategy summarization is proposed, which is envisioned as a collaborative process that involves both agents and people, and possible testbeds that could be used to evaluate progress in research on strategy summarization are suggested.
Abstract: Intelligent agents and AI-based systems are becoming increasingly prevalent. They support people in different ways, such as providing users with advice, working with them to achieve goals or acting on users’ behalf. One key capability missing in such systems is the ability to present their users with an effective summary of their strategy and expected behaviors under different conditions and scenarios. This capability, which we see as complementary to those currently under development in the context of “interpretable machine learning” and “explainable AI”, is critical in various settings. In particular, it is likely to play a key role when a user needs to collaborate with an agent, when having to choose between different available agents to act on her behalf, or when requested to determine the level of autonomy to be granted to an agent or approve its strategy. In this paper, we pose the challenge of developing capabilities for strategy summarization, which is not addressed by current theories and methods in the field. We propose a conceptual framework for strategy summarization, which we envision as a collaborative process that involves both agents and people. Last, we suggest possible testbeds that could be used to evaluate progress in research on strategy summarization.

45 citations

Proceedings ArticleDOI
06 Dec 2015
TL;DR: The trading strategies of AgentUDE, which won the Power Trading Agent Competition 2014 Final using adaptive trading strategies, are detailed, and the tournament data is analyzed from the winning agent's point of view.
Abstract: Local energy production and distributed energy storage facilities will take a leading position in the future smart grid, along with the challenge of sustainability. To cope with this, smart grid simulation platforms are needed to analyze the problems of two-way data and energy flow. The Power Trading Agent Competition (Power TAC) provides an open-source smart grid simulation platform that enables various smart grid studies from the perspective of sustainability. In addition, an annual competition is held in which broker agents trade in energy markets to meet their supply and demand. AgentUDE is one of the broker agents that competed in the Power TAC 2014 Final. This paper details the trading strategies of AgentUDE, which won the competition using adaptive trading strategies. The paper also analyzes the tournament data from the winning agent's point of view.

8 citations


Cites background from "TacTex'13: a champion adaptive powe..."

  • ...On the other side, it optimizes the future demands, prices and predicted energy costs to pick a suitable tariff among pre-created, fixed-rate candidate tariffs [3]....


Proceedings ArticleDOI
01 Dec 2015
TL;DR: This work examines retailers that maximize their relative profit, which is the (absolute) profit relative to the average profit of the other retailers, and shows that the relative profit, as a function of the own price, has a unique local maximum.
Abstract: We examine retailers that maximize their relative profit, which is the (absolute) profit relative to the average profit of the other retailers. Customer behavior is modelled by a multinomial logit (MNL) demand model. Although retailers with low retail prices attract more customers than retailers with high retail prices, the retailer with the lowest retail price, according to this model, does not attract all the customers. We provide first and second order derivatives, and show that the relative profit, as a function of the own price, has a unique local maximum. Our experiments show that relative profit maximizers "beat" absolute profit maximizers, i.e., they outperform absolute profit maximizers if the goal is to make a higher profit. These results provide insight into market simulation competitions, such as the Power TAC.
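To make the setup concrete (notation is ours; the paper's exact formulation may differ): under an MNL demand model, retailer $i$ charging price $p_i$ attracts the customer share

$$s_i(p) = \frac{e^{-\beta p_i}}{\sum_{j=1}^{n} e^{-\beta p_j}}, \qquad \beta > 0,$$

so that even the lowest-priced retailer keeps a share strictly below one. With marginal cost $c$ and market size $D$, absolute profit is $\pi_i(p) = (p_i - c)\, s_i(p)\, D$, and one common reading of the relative profit studied here is

$$\rho_i(p) = \pi_i(p) - \frac{1}{n-1}\sum_{j \neq i} \pi_j(p),$$

whose behavior in $p_i$ (via first- and second-order derivatives) is what the paper analyzes.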

4 citations


Cites background from "TacTex'13: a champion adaptive powe..."

  • ...In order to deal with the complexity of the simulation, participants agents have used computational intelligence techniques, such as reinforcement learning [23], [24], particle swarm optimization [25], and other adaptive strategies [26]–[28]....


Posted Content
TL;DR: In this article, a single unit single-shot double auction with a certain clearing price and payment rule, referred to as ACPR, was analyzed, and the best response for a bidder with complete information was derived.
Abstract: Periodic Double Auctions (PDAs) are commonly used in the real world for trading, e.g. in stock markets to determine stock opening prices, and in energy markets to trade energy in order to balance net demand in smart grids, involving trillions of dollars in the process. A bidder participating in such PDAs has to plan for bids in the current auction as well as for future auctions, which highlights the necessity of good bidding strategies. In this paper, we perform an equilibrium analysis of single unit single-shot double auctions with a certain clearing price and payment rule, which we refer to as ACPR, and find it intractable to analyze as the number of participating agents increases. We further derive the best response for a bidder with complete information in a single-shot double auction with ACPR. Leveraging the theory developed for the single-shot double auction and taking the PowerTAC wholesale market PDA as our testbed, we proceed by modeling the PDA of PowerTAC as an MDP. We propose a novel bidding strategy, namely MDPLCPBS. We empirically show that MDPLCPBS follows the equilibrium strategy for the double auctions that we previously analyzed. In addition, we benchmark our strategy against the baseline and the state-of-the-art bidding strategies for the PowerTAC wholesale market PDAs, and show that MDPLCPBS outperforms most of them consistently.
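For orientation, a periodic (call) double auction collects bids and asks and clears them at a uniform price. The sketch below uses a midpoint clearing rule purely for illustration; it is an assumption, not the ACPR rule analyzed in the paper nor the exact PowerTAC clearing mechanism.

```python
# Clear a single-unit call double auction at a uniform price.
# The midpoint-of-last-matched-pair price rule is an illustrative assumption.

def clear(bids, asks):
    """bids/asks: limit prices, one unit each. Returns (matched pairs, price)."""
    bids = sorted(bids, reverse=True)   # most eager buyers first
    asks = sorted(asks)                 # cheapest sellers first
    matched = 0
    while matched < min(len(bids), len(asks)) and bids[matched] >= asks[matched]:
        matched += 1
    if matched == 0:
        return [], None                 # no overlap between demand and supply
    price = (bids[matched - 1] + asks[matched - 1]) / 2.0
    return list(zip(bids[:matched], asks[:matched])), price

# Example: clear([50, 45, 30], [20, 40, 48]) matches two trades at price 42.5.
```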

2 citations

Book ChapterDOI
28 Jun 2021
TL;DR: In this article, the authors present a trading strategy that aims to balance gains against costs, and that was utilized by the champion of the PowerTAC-2020 tournament, TUC-TAC.
Abstract: The PowerTAC competition provides a multi-agent simulation platform for electricity markets, in which intelligent agents acting as electricity brokers compete with each other aiming to maximize their profits. Typically, the gains of agents increase as the number of their customers rises, but in parallel, costs also increase as a result of higher transmission fees that need to be paid by the electricity broker. Thus, agents that aim to take over a disproportionately high share of the market often end up with losses due to being obliged to pay huge transmission capacity fees. In this paper, we present a novel trading strategy that, based on this observation, aims to balance gains against costs, and that was utilized by the champion of the PowerTAC-2020 tournament, TUC-TAC. The approach also incorporates a wholesale market strategy that employs Monte Carlo Tree Search to determine TUC-TAC’s best course of action when participating in the market’s double auctions. The strategy is improved by making effective use of a forecasting module that seeks to predict upcoming peaks in demand, since in such intervals incurred costs significantly increase. A post-tournament analysis is also included in this paper, to help draw important lessons regarding the strengths and weaknesses of the various strategies used in the PowerTAC-2020 competition.
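Since the abstract mentions Monte Carlo Tree Search for the wholesale double auctions, a generic UCT skeleton is sketched below for orientation. It is not TUC-TAC's code; the environment interface (legal_actions, step, is_terminal, reward) is an assumed abstraction the caller would have to supply.

```python
import math, random

class _Node:
    """Search-tree node holding visit statistics for one state."""
    def __init__(self, state, actions, parent=None, action=None):
        self.state, self.parent, self.action = state, parent, action
        self.children, self.visits, self.total = [], 0, 0.0
        self.untried = list(actions)

def _best_child(node, c=1.4):
    # UCB1: exploit high average value, explore rarely visited children.
    return max(node.children,
               key=lambda n: n.total / n.visits
                             + c * math.sqrt(math.log(node.visits) / n.visits))

def mcts(root_state, legal_actions, step, is_terminal, reward, iters=1000):
    root = _Node(root_state, legal_actions(root_state))
    for _ in range(iters):
        node = root
        # Selection: walk down fully expanded nodes.
        while not node.untried and node.children:
            node = _best_child(node)
        # Expansion: add one child for an untried action.
        if node.untried:
            a = node.untried.pop()
            s = step(node.state, a)
            node.children.append(_Node(s, legal_actions(s), parent=node, action=a))
            node = node.children[-1]
        # Simulation: random rollout until a terminal state.
        s = node.state
        while not is_terminal(s):
            s = step(s, random.choice(legal_actions(s)))
        value = reward(s)
        # Backpropagation: update statistics along the path to the root.
        while node is not None:
            node.visits, node.total = node.visits + 1, node.total + value
            node = node.parent
    return max(root.children, key=lambda n: n.visits).action
```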

2 citations

References
Book
15 Apr 1994
TL;DR: Puterman provides a uniquely up-to-date, unified, and rigorous treatment of the theoretical, computational, and applied research on Markov decision process models, focusing primarily on infinite horizon discrete time models and models with discrete state spaces, while also examining models with arbitrary state spaces, finite horizon models, and continuous time discrete state models.
Abstract: From the Publisher: The past decade has seen considerable theoretical and applied research on Markov decision processes, as well as the growing use of these models in ecology, economics, communications engineering, and other fields where outcomes are uncertain and sequential decision-making processes are needed. A timely response to this increased activity, Martin L. Puterman's new work provides a uniquely up-to-date, unified, and rigorous treatment of the theoretical, computational, and applied research on Markov decision process models. It discusses all major research directions in the field, highlights many significant applications of Markov decision processes models, and explores numerous important topics that have previously been neglected or given cursory coverage in the literature. Markov Decision Processes focuses primarily on infinite horizon discrete time models and models with discrete time spaces while also examining models with arbitrary state spaces, finite horizon models, and continuous-time discrete state models. The book is organized around optimality criteria, using a common framework centered on the optimality (Bellman) equation for presenting results. The results are presented in a "theorem-proof" format and elaborated on through both discussion and examples, including results that are not available in any other book. A two-state Markov decision process model, presented in Chapter 3, is analyzed repeatedly throughout the book and demonstrates many results and algorithms. Markov Decision Processes covers recent research advances in such areas as countable state space models with average reward criterion, constrained models, and models with risk sensitive optimality criteria. It also explores several topics that have received little or no attention in other books, including modified policy iteration, multichain models with average reward criterion, and sensitive optimality. In addition, a Bibliographic Remarks section in each chapter comments on relevant historic
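For reference, the optimality (Bellman) equation around which the book is organized can be written, for an infinite-horizon discounted MDP in standard notation (notation ours), as

$$v^*(s) = \max_{a \in A_s} \Big\{ r(s,a) + \gamma \sum_{s' \in S} p(s' \mid s, a)\, v^*(s') \Big\}, \qquad 0 \le \gamma < 1,$$

and value iteration approximates $v^*$ by repeatedly applying the right-hand side as an update, $v_{k+1}(s) \leftarrow \max_a \{ r(s,a) + \gamma \sum_{s'} p(s' \mid s,a)\, v_k(s') \}$.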

11,625 citations

BookDOI
04 Aug 2011
TL;DR: This book discusses the challenges of dynamic programming, including the three curses of dimensionality, and develops approximate dynamic programming (ADP) methods, covering topics from stochastic approximation and stepsize formulas to value function approximation and direct ADP for online applications.
Abstract: Preface. Acknowledgments. 1. The challenges of dynamic programming. 1.1 A dynamic programming example: a shortest path problem. 1.2 The three curses of dimensionality. 1.3 Some real applications. 1.4 Problem classes. 1.5 The many dialects of dynamic programming. 1.6 What is new in this book? 1.7 Bibliographic notes. 2. Some illustrative models. 2.1 Deterministic problems. 2.2 Stochastic problems. 2.3 Information acquisition problems. 2.4 A simple modeling framework for dynamic programs. 2.5 Bibliographic notes. Problems. 3. Introduction to Markov decision processes. 3.1 The optimality equations. 3.2 Finite horizon problems. 3.3 Infinite horizon problems. 3.4 Value iteration. 3.5 Policy iteration. 3.6 Hybrid value-policy iteration. 3.7 The linear programming method for dynamic programs. 3.8 Monotone policies. 3.9 Why does it work? 3.10 Bibliographic notes. Problems. 4. Introduction to approximate dynamic programming. 4.1 The three curses of dimensionality (revisited). 4.2 The basic idea. 4.3 Sampling random variables. 4.4 ADP using the post-decision state variable. 4.5 Low-dimensional representations of value functions. 4.6 So just what is approximate dynamic programming? 4.7 Experimental issues. 4.8 Dynamic programming with missing or incomplete models. 4.9 Relationship to reinforcement learning. 4.10 But does it work? 4.11 Bibliographic notes. Problems. 5. Modeling dynamic programs. 5.1 Notational style. 5.2 Modeling time. 5.3 Modeling resources. 5.4 The states of our system. 5.5 Modeling decisions. 5.6 The exogenous information process. 5.7 The transition function. 5.8 The contribution function. 5.9 The objective function. 5.10 A measure-theoretic view of information. 5.11 Bibliographic notes. Problems. 6. Stochastic approximation methods. 6.1 A stochastic gradient algorithm. 6.2 Some stepsize recipes. 6.3 Stochastic stepsizes. 6.4 Computing bias and variance. 6.5 Optimal stepsizes. 6.6 Some experimental comparisons of stepsize formulas. 6.7 Convergence. 6.8 Why does it work? 6.9 Bibliographic notes. Problems. 7. Approximating value functions. 7.1 Approximation using aggregation. 7.2 Approximation methods using regression models. 7.3 Recursive methods for regression models. 7.4 Neural networks. 7.5 Batch processes. 7.6 Why does it work? 7.7 Bibliographic notes. Problems. 8. ADP for finite horizon problems. 8.1 Strategies for finite horizon problems. 8.2 Q-learning. 8.3 Temporal difference learning. 8.4 Policy iteration. 8.5 Monte Carlo value and policy iteration. 8.6 The actor-critic paradigm. 8.7 Bias in value function estimation. 8.8 State sampling strategies. 8.9 Starting and stopping. 8.10 A taxonomy of approximate dynamic programming strategies. 8.11 Why does it work? 8.12 Bibliographic notes. Problems. 9. Infinite horizon problems. 9.1 From finite to infinite horizon. 9.2 Algorithmic strategies. 9.3 Stepsizes for infinite horizon problems. 9.4 Error measures. 9.5 Direct ADP for online applications. 9.6 Finite horizon models for steady state applications. 9.7 Why does it work? 9.8 Bibliographic notes. Problems. 10. Exploration vs. exploitation. 10.1 A learning exercise: the nomadic trucker. 10.2 Learning strategies. 10.3 A simple information acquisition problem. 10.4 Gittins indices and the information acquisition problem. 10.5 Variations. 10.6 The knowledge gradient algorithm. 10.7 Information acquisition in dynamic programming. 10.8 Bibliographic notes. Problems. 11. Value function approximations for special functions. 11.1 Value functions versus gradients. 11.2 Linear approximations. 11.3 Piecewise linear approximations. 11.4 The SHAPE algorithm. 11.5 Regression methods. 11.6 Cutting planes. 11.7 Why does it work? 11.8 Bibliographic notes. Problems. 12. Dynamic resource allocation. 12.1 An asset acquisition problem. 12.2 The blood management problem. 12.3 A portfolio optimization problem. 12.4 A general resource allocation problem. 12.5 A fleet management problem. 12.6 A driver management problem. 12.7 Bibliographic references. Problems. 13. Implementation challenges. 13.1 Will ADP work for your problem? 13.2 Designing an ADP algorithm for complex problems. 13.3 Debugging an ADP algorithm. 13.4 Convergence issues. 13.5 Modeling your problem. 13.6 Online vs. offline models. 13.7 If it works, patent it!
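To illustrate one ingredient from this table of contents, namely temporal-difference learning with an approximate (here linear) value function, a minimal sketch follows; the environment interface and feature map are assumptions for illustration, not code from the book.

```python
def td0_linear(features, transition, start_state, n_features,
               episodes=100, max_steps=50, alpha=0.05, gamma=0.95):
    """TD(0) with a linear value-function approximation.
    features(s) -> list of n_features floats; transition(s) -> (s_next, reward, done);
    start_state() -> initial state. All three are caller-supplied assumptions."""
    w = [0.0] * n_features                           # weights of the value function
    v = lambda s: sum(wi * xi for wi, xi in zip(w, features(s)))
    for _ in range(episodes):
        s = start_state()
        for _ in range(max_steps):
            s_next, r, done = transition(s)
            target = r + (0.0 if done else gamma * v(s_next))
            delta = target - v(s)                    # TD error
            w = [wi + alpha * delta * xi for wi, xi in zip(w, features(s))]
            if done:
                break
            s = s_next
    return w
```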

2,300 citations

Journal ArticleDOI
TL;DR: The survey discusses distance functions, smoothing parameters, weighting functions, local model structures, regularization of the estimates and bias, assessing predictions, handling noisy data and outliers, improving the quality of predictions by tuning fit parameters, and applications of locally weighted learning.
Abstract: This paper surveys locally weighted learning, a form of lazy learning and memory-based learning, and focuses on locally weighted linear regression. The survey discusses distance functions, smoothing parameters, weighting functions, local model structures, regularization of the estimates and bias, assessing predictions, handling noisy data and outliers, improving the quality of predictions by tuning fit parameters, interference between old and new data, implementing locally weighted learning efficiently, and applications of locally weighted learning. A companion paper surveys how locally weighted learning can be used in robot learning and control.
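Since the quote below notes that TacTex'13 used locally weighted regression for customer preference prediction, a minimal LWR sketch may help; the Gaussian kernel and bandwidth choice are illustrative, not the survey's (or TacTex'13's) exact settings.

```python
import numpy as np

def lwr_predict(X, y, x_query, bandwidth=1.0):
    """Locally weighted linear regression: fit a weighted linear model around
    x_query and return its prediction. X: (n, d) inputs, y: (n,) targets."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    xq = np.asarray(x_query, dtype=float)
    # Gaussian kernel: weight training points by distance to the query.
    d2 = np.sum((X - xq) ** 2, axis=1)
    w = np.exp(-d2 / (2.0 * bandwidth ** 2))
    # Weighted least squares with a bias column.
    Xa = np.hstack([X, np.ones((X.shape[0], 1))])
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(Xa * sw[:, None], y * sw, rcond=None)
    return float(np.append(xq, 1.0) @ beta)

# Example: lwr_predict([[0.0], [1.0], [2.0]], [0.0, 1.0, 4.0], [1.5])
```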

1,863 citations


"TacTex'13: a champion adaptive powe..." refers methods in this paper

  • ...LWR (see, e.g. (Atkeson, Moore, and Schaal 1997)) was chosen since, being non-parametric, it requires very minimal assumptions about the representation of the predicted function (the customer preference function)....


Journal ArticleDOI
TL;DR: In June 2000, after two years of fairly smooth operation, California's deregulated wholesale electricity market began producing extremely high prices and threats of supply shortages, demonstrating dramatically why most current electricity markets are extremely volatile: demand is difficult to forecast and exhibits virtually no price responsiveness, while supply faces strict production constraints and prohibitive storage costs.
Abstract: In June 2000, after two years of fairly smooth operation, California's deregulated wholesale electricity market began producing extremely high prices and threats of supply shortages. The upheaval demonstrated dramatically why most current electricity markets are extremely volatile: demand is difficult to forecast and exhibits virtually no price responsiveness, while supply faces strict production constraints and prohibitive storage costs. This structure leads to periods of surplus and of shortage, the latter exacerbated by sellers' ability to exercise market power. Electricity markets can function much more smoothly, however, if they are designed to support price-responsive demand and long-term wholesale contracts for electricity.

566 citations


"TacTex'13: a champion adaptive powe..." refers background in this paper

  • ...ing opened to competition, however, the transition to competitive markets can be risky (Borenstein 2002)....


Journal ArticleDOI
TL;DR: A research agenda for making the smart grid a reality is presented, with a focus on energy efficiency, smart grids and smart cities.
Abstract: The phenomenal growth in material wealth experienced in developed countries throughout the twentieth century has largely been driven by the availability of cheap energy derived from fossil fuels (originally coal, then oil, and most recently natural gas). However, the continued availability of this cheap energy cannot be taken for granted given the growing concern that increasing demand for these fuels (and particularly, demand for oil) will outstrip our ability to produce them (so called 'peak oil'). Many mature oil and gas fields around the world have already peaked and their annual production is now steadily declining. Predictions of when world oil production will peak vary between 0-20 years into the future, but even the most conservative estimates provide little scope for complacency given the significant price increases that peak oil is likely to precipitate. Furthermore, many of the oil and gas reserves that do remain are in environmentally or politically sensitive regions of the world where threats to supply create increased price volatility (as evidenced by the 2010 Deepwater Horizon disaster and 2011 civil unrest in the Middle East). Finally, the growing consensus on the long term impact of carbon emissions from burning fossil fuels suggests that even if peak oil is avoided, and energy security assured, a future based on fossil fuel use will expose regions of the world to damaging climate change that will make the lives of many of the world's poorest people even harder.

513 citations


"TacTex'13: a champion adaptive powe..." refers background in this paper

  • ...As a result, energy consumption patterns will have to adapt to the availability of renewable energy supply (Ramchurn et al. 2012)....
