Adaptive Strategies for Dynamic Pricing Agents
Abstract: Dynamic Pricing (DyP) is a form of Revenue Management in which the price of a (usually) perishable good is changed over time to increase revenue. It is an effective method that has become even more relevant and useful with the emergence of Internet firms and the possibility of readily and frequently updating prices. In this paper a new approach to DyP is presented. We design an adaptive dynamic pricing strategy and optimize its parameters with an Evolutionary Algorithm (EA) offline, while the strategy can deal with stochastic market dynamics quickly online. We design the adaptive heuristic dynamic pricing strategy in a duopoly where each firm has a finite inventory of a single type of good. We consider two cases, one in which the average of a customer population's stochastic valuation for each of the goods is constant throughout the selling horizon and one in which the average customer valuation for each good is changed according to a random Brownian motion. We also design an agent-based software framework for simulating various dynamic pricing strategies in agent-based marketplaces with multiple firms in a bounded time horizon. We use an EA to optimize the parameters of the pricing strategy in each of the settings and compare our strategy with other strategies from the literature. We also perform sensitivity analysis and show that the optimized strategy works well even when used in settings with varied demand functions.
Summary
- Dynamic Pricing (DyP) is a form of Revenue Management (RM) that involves changing the price of goods or services over time with the aim of increasing revenue.
- Today, the Internet provides exceptional opportunities for practicing RM and particularly DyP.
- By changing prices over time, firms can ask for the price that yields the highest revenue at each moment.
- On one hand, the strategies are capable of very fast adaptive decision making at run time; on the other hand, their parameters are optimized offline to tune them for a specific setting.
- Also, because of the adaptiveness of the proposed pricing strategies, any deviations from the expected dynamics of the market will be detected quickly and accounted for by the strategy online, and thus the strategies also work reasonably well in various market settings different from what they have been tuned for.
- The authors have a market with two competing firms.
- The revenue (sum of price of items sold) of one of the firms is optimized using DyP.
- Each firm can change the price of its goods at the start of equi-distant time intervals.
- The model is a finite horizon model: the goods left at the end of each time step are transferred into the next, and all goods are lost at the end of the whole time span.
- Each firm announces a selling price, pj(t), for each good type in each time interval t.
- A cost for each good type, crj, serves as a reserve price for goods of that type.
- The customers specify their preferences using non-negative cardinal utilities that are exchangeable with monetary payments.
- Thus, a customer’s utility for getting an item is equal to the difference between its valuation for the item and the item’s price: uj(t) = v(gj) − pj(t) for firm j’s good at time t.
- The distribution of customer valuations, denoted by Prj,t for firm j’s good at time t, may or may not change in time.
- The number of customers that arrive in each time step follows a Poisson process with a constant intensity a.
- They may or may not buy a product based on their choice function and, in any case, leave the market afterwards.
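The customer model above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: valuations are drawn from a normal distribution, a customer buys the good with the highest non-negative utility, and inventory limits are ignored. The names `poisson`, `choose`, and `simulate_step` and the parameter `sd` are hypothetical.

```python
import math
import random


def poisson(rng, lam):
    """Draw a Poisson(lam) sample with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def choose(valuations, prices):
    """Return the index of the good with the highest utility v - p,
    or None if every utility is negative (assumed choice rule)."""
    utilities = [v - p for v, p in zip(valuations, prices)]
    best = max(range(len(utilities)), key=utilities.__getitem__)
    return best if utilities[best] >= 0 else None


def simulate_step(rng, prices, means, sd, a):
    """Count sales per firm in one time step with arrival intensity a."""
    sales = [0] * len(prices)
    for _ in range(poisson(rng, a)):
        # Assumption: each valuation is normal around the population mean.
        valuations = [rng.gauss(m, sd) for m in means]
        pick = choose(valuations, prices)
        if pick is not None:
            sales[pick] += 1
    return sales
```

Each customer is drawn, makes at most one purchase, and is then discarded, which matches the description that customers leave the market after their decision.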
C. Modeling Time
- In any dynamic pricing model, by definition, the firms should be able to adjust their prices in time.
- While changing prices at any particular moment may become more plausible, particularly with internet firms, it is still more common in the literature for the change of prices to occur in fixed time intervals.
- At the start of each time interval, all firms set the prices for their goods.
IV. MARKET SIMULATION
- The authors have developed software for simulating a marketplace described in the previous section.
- The software uses an event queue to keep track of two types of events: pricing events, firms setting the price for each of their item types at the beginning of each time period, and customer arrival events.
- Some notation that helps describe the experiments in the following section is defined here.
- These parameters consist of the properties of the firms (costs of goods, initial stock, etc.) and the valuation distributions and arrival rate of the customers.
- Based on the above definitions, an instance of the problem is deterministic given the firms’ strategies, i.e. will yield the same results when the firms use the same strategies, while a configuration alone does not contain enough information to determine an outcome.
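The event queue and the distinction between a configuration and an instance can be illustrated as follows. This is a sketch under assumptions rather than the authors' software: an instance is modeled here as a configuration plus a random seed, arrivals are generated with exponential inter-arrival times so that the count per unit time step is Poisson with intensity `a`, and the names `build_event_queue` and `run` are hypothetical.

```python
import heapq
import random

PRICING, ARRIVAL = 0, 1  # pricing events sort before arrivals at equal times


def build_event_queue(seed, horizon, a):
    """Return a heap of (time, kind) events for one problem instance."""
    rng = random.Random(seed)
    # One pricing event at the start of each time period.
    events = [(float(t), PRICING) for t in range(horizon)]
    # Customer arrivals with rate a per time period.
    t = rng.expovariate(a)
    while t < horizon:
        events.append((t, ARRIVAL))
        t += rng.expovariate(a)
    heapq.heapify(events)
    return events


def run(events):
    """Pop events in time order and count how many of each kind occurred."""
    counts = {PRICING: 0, ARRIVAL: 0}
    while events:
        _, kind = heapq.heappop(events)
        counts[kind] += 1
    return counts
```

Building the queue twice with the same seed yields the same event sequence, which is the sense in which an instance is deterministic while a configuration alone is not.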
A. The Inventory Based (IB) Strategy
- The first strategy adaptively adjusts the prices for a firm based on the number of goods it has left and the number of goods that it has sold in the previous time interval. The authors call it the Inventory Based (IB) strategy.
- The maxDecPercent and maxIncPercent parameters along with the distance that the sales rate has from the expected sales rate control the amount of change in price in each time step.
- Finally, pastCustomers is the total number of customers in the previous time step and aveCustomers is the average number of customers per time step (same as a).
- Note that this is a dynamic indicator updated in the beginning of each time step, so it will take into account the current state of the agent.
- If α is smaller than one, then the sales rate is too slow, and if it is larger than one, the inventory would be exhausted sooner than the end of the time horizon, so there is an opportunity for increasing the price.
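The summary does not reproduce the IB algorithm itself, so the following is only one plausible reading of the description, with every formula an assumption. Here α is taken as the projected sales over the remaining horizon divided by the remaining stock, the price change is proportional to the distance of α from one and capped by `max_dec_percent` and `max_inc_percent`, and the cost acts as a price floor. The function names and the parameters `stock`, `steps_left`, and `cost` are hypothetical.

```python
def sales_indicator(sold_last, past_customers, ave_customers, stock, steps_left):
    """Assumed form of alpha: projected remaining sales / remaining stock."""
    if stock == 0 or steps_left == 0 or past_customers == 0:
        return 1.0  # no stock, no time, or no observation: leave the price as is
    rate = sold_last / past_customers  # purchases per customer last step
    projected = rate * ave_customers * steps_left
    return projected / stock


def ib_price(price, alpha, max_dec_percent, max_inc_percent, cost):
    """Move the price toward the point where stock sells out at the horizon."""
    if alpha < 1.0:
        # Selling too slowly: decrease, at most by max_dec_percent.
        change = -max_dec_percent * (1.0 - alpha)
    else:
        # Selling too fast: increase, at most by max_inc_percent.
        change = max_inc_percent * min(alpha - 1.0, 1.0)
    # Assumption: the cost serves as a reserve price, so never go below it.
    return max(cost, price * (1.0 + change / 100.0))
```

Dividing last step's sales by `past_customers` and rescaling by `ave_customers` keeps an unusually busy or quiet step from being mistaken for a change in demand at the current price.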
B. The Revenue Based (RB) Strategy
- The Revenue Based (RB) strategy uses an estimate of a desirable price to set the price in each time step.
- The algorithm for this strategy can be seen in Algorithm 2.
- These variables are also used: RPC, the revenue per customer in the previous time step, and expRPC, the expected RPC, used as a control parameter.
- This is not always the case, but it can be a safe assumption when the initial price and expected price are chosen properly, as the experimental results show.
- Note that both strategies depend only on information from the sales in one previous time step.
C. Computing the Parameters
- The authors want parameter settings under which the strategies they have defined perform well.
- The authors therefore need black-box optimization algorithms that are capable of tackling a large class of problems effectively.
- AMaLGaM is essentially an Evolutionary Algorithm (EA) in which a normal distribution is estimated from the better, selected solutions and subsequently adapted to be aligned favorably with the local structure of the search space.
- The EA is not designed to handle stochasticity, and the parameters should not over-fit a single instance of the problem. To tune the experiments with both concerns in mind, the authors use the following method.
- The parameters of the heuristic strategies defined in section V are optimized for firm 0, given that firm 1 follows a fixed price strategy.
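One common way to reconcile a deterministic optimizer with a stochastic market, offered here as an assumption about the general idea rather than the authors' exact procedure, is to score each parameter vector by its average revenue over a fixed set of instances. The loop below is a toy Gaussian estimation-of-distribution stand-in, not AMaLGaM, and the `simulate(params, seed)` callback, `make_fitness`, and `gaussian_ea` are hypothetical names.

```python
import random
import statistics


def make_fitness(simulate, seeds):
    """Average revenue over a fixed set of instances, so scores are repeatable."""
    def fitness(params):
        return statistics.fmean(simulate(params, seed) for seed in seeds)
    return fitness


def gaussian_ea(fitness, dim, generations=30, pop=20, elite=5, seed=0):
    """Toy loop: sample from a diagonal normal, select, refit the normal."""
    rng = random.Random(seed)
    mean, sd = [0.0] * dim, [1.0] * dim
    best = None
    for _ in range(generations):
        population = [
            [rng.gauss(m, s) for m, s in zip(mean, sd)] for _ in range(pop)
        ]
        population.sort(key=fitness, reverse=True)
        selected = population[:elite]
        if best is None or fitness(selected[0]) > fitness(best):
            best = selected[0]
        # Refit the sampling distribution to the selected solutions.
        mean = [statistics.fmean(col) for col in zip(*selected)]
        sd = [max(statistics.pstdev(col), 1e-3) for col in zip(*selected)]
    return best
```

Because the seed set is fixed, two evaluations of the same parameters return the same score, while averaging over several seeds discourages tuning to the quirks of any one instance.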
VI. EXPERIMENTAL RESULTS
- The authors ran the EA in this way multiple times for each strategy, both for the case where the customers’ valuation distribution does not change with Brownian motion, and for the case in which it does.
- It also illustrates a case where firm 1 has a slightly more “expensive” good: its cost, its price, and the customers’ valuation for it are higher than for firm 0’s good.
- In the Brownian case, the IB strategy has a 66% increase in profit and the RB algorithm has a 63% increase compared to the fixed price strategy, with similar results for the DF strategy.
- GD still yields an almost 35% profit gain compared to the FP, which is a strong result for an algorithm that uses only one parameter to adjust the price.
- In each cell of these tables, the first number (from top) shows the percentage of instances in which the first (row) strategy performs better than the other (column) strategy.
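The first number in each table cell can be computed as in the sketch below. It assumes that both strategies are run on the same list of instances and that "performs better" means strictly higher revenue; the function names are hypothetical, and any input values used with them are illustrative rather than results from the paper.

```python
def win_percentage(revenues_row, revenues_col):
    """Percentage of instances in which the row strategy earns strictly more."""
    if len(revenues_row) != len(revenues_col):
        raise ValueError("both strategies must be evaluated on the same instances")
    wins = sum(r > c for r, c in zip(revenues_row, revenues_col))
    return 100.0 * wins / len(revenues_row)


def comparison_table(results):
    """Map (row, column) strategy pairs to the row strategy's win percentage."""
    names = sorted(results)
    return {
        (row, col): win_percentage(results[row], results[col])
        for row in names
        for col in names
        if row != col
    }
```

Ties count for neither strategy under this reading, so the two percentages for a pair need not sum to 100.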
VII. SENSITIVITY ANALYSIS
- In this section the authors study the robustness of the optimized strategies by computing the amount of revenue loss suffered in case of wrong assumptions about the market configuration.
- To do this, the authors run the best performing strategy that they have designed so far, IB, on configurations that differ from their default Brownian and non-Brownian configurations (see section VI).
- The authors consider a class of varied configurations where the customer arrival rate, a, is changed compared to the standard non-Brownian configuration discussed above.
- The results show that even in the most severe cases in their experiments, less than 10% of the profit is lost by incorrectly predicting the model.
VIII. CONCLUSIONS AND FURTHER WORK
- The authors have presented a framework for implementing dynamic pricing in an interactive agent-based marketplace.
- The authors showed that, for the cases they study, the heuristics yield revenues that are consistently better than those of the best offline-optimized fixed price and of various derivative follower algorithms.
- The strategies are also adaptive and robust to market dynamics.
- In both the Brownian and non-Brownian cases, their IB strategy can still perform well with the same optimized parameters when the demand is increased up to 160% of the original configuration, compared to when the parameters are specifically optimized considering the demand change.