# Performance Optimization of a Parallel, Two Stage Stochastic Linear Program

## Summary (3 min read)

### Introduction

- The authors ground their work in a real application: the optimization of US military aircraft allocation to various cargo and personnel movement missions in the face of uncertain demands.
- Keywords: stochastic optimization, parallel computing, large scale optimization, airfleet management.
- Stochastic optimization provides a means of coping with the uncertainty inherent in real-world systems, and with models that are nonlinear, of high dimensionality, or not conducive to deterministic optimization techniques.
- The solutions obtained can be far from optimal even with small perturbations of the input data.
- Examples include making investment decisions in order to increase profit, transportation (planning and scheduling logistics), design-space exploration in product design, etc.
- The authors present their parallel decomposition and some interesting considerations in dealing with computation-communication granularity, responsiveness, and the lack of persistence of work loads in an iterative setting.

### II. MODEL FORMULATION & APPROACH

- The United States Air Mobility Command (AMC) manages a fleet of over 1300 aircraft [2] that operate globally under uncertain and rapidly changing demands.
- Aircraft are allocated at different bases in anticipation of the demands for several missions to be conducted over an upcoming time period (typically, fifteen days to one month).
- The purpose of a stochastic formulation is to optimally allocate aircraft to each mission such that subsequent disruptions are minimized.
- Note that their formulation of the aircraft allocation model has complete recourse (i.e. all candidate allocations generated are feasible) because any demand (in a particular scenario) that cannot be satisfied by a candidate allocation is met by short term leasing of civilian aircraft at a high cost while evaluating that scenario.
- The second stage optimization helps Stage 1 to take the recourse action of increasing the capacity for satisfying an unmet demand by providing feedback in the form of additional constraints (cuts) on the Stage 1 LP (6).
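
For orientation, a generic two-stage stochastic LP with Benders optimality cuts can be written in textbook notation as shown here. All symbols ($c$, $A$, $b$, $q_s$, $W$, $h_s$, $T_s$, $p_s$, $\pi_{s,j}$, $\theta_s$) are assumptions taken from the standard literature, not from the paper; the paper's own equations (5) and (6) are not reproduced in this summary and may differ in detail.

```latex
\begin{align*}
\text{Stage 1:}\quad & \min_{x,\theta}\; c^{\top}x + \sum_{s} p_s\,\theta_s
  \quad \text{s.t. } Ax = b,\; x \ge 0,\;
  \theta_s \ge \pi_{s,j}^{\top}\,(h_s - T_s x)\ \text{ for each cut } j,\\
\text{Stage 2:}\quad & Q_s(x) = \min_{y}\; q_s^{\top}y
  \quad \text{s.t. } W y = h_s - T_s x,\; y \ge 0 .
\end{align*}
```

Here $x$ plays the role of the candidate allocation, $s$ indexes scenarios with probability $p_s$, and each $\pi_{s,j}$ is a dual solution of a Stage 2 LP solved at an earlier candidate allocation, so each cut is a lower bound on $Q_s(x)$.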

### III. PARALLEL PROGRAM DESIGN

- The authors have implemented the program in Charm++ [6], [7], which is a message-driven, object-oriented parallel programming framework with an adaptive runtime system.
- It also emphasizes any sequential bottlenecks and has motivated some of their efforts in optimizing solve times.
- Since the unit of sequential computation is an LP solve, the two-stage formulation maps readily onto a two-stage parallel design, with the first stage generating candidate allocations, and the second stage evaluating these allocations over a spectrum of scenarios that are of interest.
- An Allocation Generator object acts as the master and generates allocations, while a collection of Scenario Evaluator objects are responsible for the evaluation of all the scenarios.
- Charm++ provides flexibility in the placement of compute objects on processors.
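
The master-worker structure described above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the paper uses Charm++ and a real LP solver, whereas this sketch uses a thread pool and replaces each Stage 2 LP with a closed-form toy cost (unmet demand is leased at a hypothetical `LEASE_COST`). The names `evaluate_scenario` and `evaluate_allocation` are invented for this example.

```python
from concurrent.futures import ThreadPoolExecutor

LEASE_COST = 5.0  # hypothetical cost per unit of unmet demand (short-term lease)

def evaluate_scenario(allocation, demand):
    """Stand-in for one Stage 2 solve: recourse cost and its slope in the allocation."""
    shortfall = max(demand - allocation, 0.0)
    cost = LEASE_COST * shortfall
    slope = -LEASE_COST if shortfall > 0 else 0.0
    return cost, slope

def evaluate_allocation(allocation, demands, max_workers=4):
    """Master side: one task per scenario, then aggregate into a single cut."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(lambda d: evaluate_scenario(allocation, d), demands))
    n = len(demands)
    expected_cost = sum(c for c, _ in results) / n
    expected_slope = sum(s for _, s in results) / n
    # Cut in the form theta >= intercept + slope * x, tight at the evaluated allocation.
    intercept = expected_cost - expected_slope * allocation
    return expected_cost, (intercept, expected_slope)

demands = [8.0, 10.0, 12.0, 14.0]  # equally likely toy scenarios
cost, cut = evaluate_allocation(10.0, demands)
print(cost, cut)
```

In a full Benders loop the returned cut would be added to the Stage 1 model, which would then be re-solved to produce the next candidate allocation.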

### IV. OPTIMIZING STAGE 1

- The two-stage design yields an allocation that is iteratively evolved toward the optimum.
- As the Stage 1 model grows larger every round, it becomes increasingly limited by the memory subsystem and experiences dilated times for LP solves.
- The cut usage rate is defined as $\text{Cut Usage Rate} = \dfrac{\text{number of rounds in which the cut is active}}{\text{number of rounds since its generation}}$ (7). The authors therefore implemented a cut retirement scheme that discards (retires) cuts whenever the total number of cuts in the Stage 1 model exceeds a configurable threshold.
- The recently used cuts are scored higher.
- For more details and proofs for the weighting function, refer to [9].
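
A minimal sketch of threshold-triggered cut retirement follows, ranking cuts by the usage rate of equation (7). It is an assumption-laden illustration: the `Cut` class, `retire_cuts`, and the choice to treat a cut of age zero as fully used are invented here, and the recency-weighted LRFU scoring that the authors describe (see [9]) is not reproduced.

```python
from dataclasses import dataclass

@dataclass
class Cut:
    name: str
    born_round: int          # round in which the cut was generated
    active_rounds: int = 0   # rounds in which the cut was active

    def usage_rate(self, current_round):
        """Equation (7): active rounds divided by rounds since generation."""
        age = current_round - self.born_round
        # Assumption: a cut created this round has no history, so keep it.
        return self.active_rounds / age if age > 0 else 1.0

def retire_cuts(cuts, current_round, max_cuts):
    """Return (kept, retired); nothing is retired until the threshold is exceeded."""
    if len(cuts) <= max_cuts:
        return list(cuts), []
    ranked = sorted(cuts, key=lambda c: c.usage_rate(current_round), reverse=True)
    return ranked[:max_cuts], ranked[max_cuts:]

cuts = [Cut("a", 0, 8), Cut("b", 2, 1), Cut("c", 5, 5), Cut("d", 9, 0)]
kept, retired = retire_cuts(cuts, current_round=10, max_cuts=3)
print([c.name for c in kept], [c.name for c in retired])
```

Ranking by plain usage rate ignores how recently a cut was active; a recency-aware score would address that, which is the motivation the summary gives for scoring recently used cuts higher.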

### V. OPTIMIZING STAGE 2

- This constitutes the major volume of the computation involved in the Benders approach because of the large number of scenarios in practical applications.
- Their experiments show that runs with advanced start take fewer rounds to converge than with a fresh start.
- The authors do not yet have data to back any line of reasoning that can explain this.
- The authors also implement random clustering for reference.
- Figure 10 compares the improvement in average Stage 2 solve times when scenarios are clustered using Algorithm 1.
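
The idea of solving similar scenarios back to back, so that an advanced (warm) start has a nearby basis to reuse, can be sketched as below. This is not the paper's Algorithm 1, which is not reproduced in this summary; it is a hypothetical greedy nearest-neighbor ordering over scenario demand vectors, with a random grouping for reference. All function names and the toy data are assumptions.

```python
import random

def distance(a, b):
    """Euclidean distance between two scenario demand vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def similarity_order(scenarios):
    """Greedy chain: start at scenario 0, always visit the nearest unvisited one."""
    remaining = list(range(1, len(scenarios)))
    order = [0]
    while remaining:
        last = scenarios[order[-1]]
        nxt = min(remaining, key=lambda i: distance(last, scenarios[i]))
        remaining.remove(nxt)
        order.append(nxt)
    return order

def split_into_clusters(order, num_clusters):
    """Cut an ordering into contiguous chunks, one chunk per evaluator."""
    size = -(-len(order) // num_clusters)  # ceiling division
    return [order[i:i + size] for i in range(0, len(order), size)]

def random_clusters(num_scenarios, num_clusters, seed=0):
    """Reference grouping that ignores similarity."""
    idx = list(range(num_scenarios))
    random.Random(seed).shuffle(idx)
    return split_into_clusters(idx, num_clusters)

scenarios = [(10, 2), (50, 40), (11, 3), (52, 41)]  # toy demand vectors
order = similarity_order(scenarios)
clusters = split_into_clusters(order, 2)
print(order, clusters)
```

Each evaluator would then solve the scenarios of its cluster in the listed order, passing the previous solution to the solver as a starting point.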

### VI. SCALABILITY

- With the optimizations described above, the authors were able to scale medium-sized problems up to 122 cores of an Intel-64 Clovertown (2.33 GHz) cluster with 8 cores per node.
- For 120 scenarios, an execution that uses 122 processors represents the limit of parallel decomposition using the described approach: one Stage 1 object, one Work Allocator object, and 120 Scenario Evaluators that each solve one scenario.
- Figure 13(a) and 13(b) show the scalability plots with Stage 1 and Stage 2 wall time breakdown.
- The plots also demonstrate Amdahl’s effect as the maximum parallelism available is proportional to the number of scenarios that can be solved in parallel, and scaling is limited by the sequential Stage 1 computations.
- It must be noted that real-world problems may involve several hundreds or thousands of scenarios, and their current design should yield significant speedups because of Stage 2 parallelization.
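
The decomposition limit and the Amdahl effect mentioned above can be checked with a short calculation. The core count follows the breakdown given in the text (one Stage 1 object, one Work Allocator, one evaluator per scenario); the serial fraction of 0.1 is a purely hypothetical value chosen for illustration, not a measurement from the paper.

```python
def cores_needed(num_scenarios):
    """One Stage 1 object + one Work Allocator + one evaluator per scenario."""
    return 1 + 1 + num_scenarios

def amdahl_speedup(serial_fraction, workers):
    """Amdahl's law: speedup when only the non-serial part is spread over workers."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / workers)

print(cores_needed(120))
# Hypothetical: if Stage 1 took 10% of the sequential run time,
# 120 evaluators could not push the speedup past 1 / 0.1 = 10.
print(amdahl_speedup(0.1, 120))
```

With more scenarios, Stage 2 accounts for a larger share of the total work, the serial fraction shrinks, and the attainable speedup grows, which is consistent with the expectation stated for real-world problems.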

### VIII. SUMMARY

- Most stochastic programs incorporate a large number of scenarios to hedge against many possible uncertainties.
- For stochastic optimization with the Benders approach, the vast bulk of computation can be parallelized using the master-worker design described in this paper.
- The authors presented an LRFU-based cut management scheme that completely eliminates the memory bottleneck and significantly reduces the Stage 1 solve time, thus making the optimization of large scale problems tractable.
- Much higher speedups can be obtained for real-world problems, which present much larger Stage 2 computational loads.
- The authors are currently exploring methods such as Lagrangean decomposition to alleviate this.

### IX. ACKNOWLEDGMENTS

- The research was supported by MITRE Research Agreement Number 81990 with UIUC.
- The Gurobi linear program solver is licensed at no cost for academic use.
- Runs on Abe cluster were done under the TeraGrid [14] allocation grant ASC050040N supported by NSF.
