# A classification-based approach to the optimal control of affine switched systems

## Summary

### Introduction

- The paper considers switched affine systems, which are hybrid systems where the discrete state identifies the mode of the system and the continuous state is governed by mode-specific affine dynamics.
- The objective is to compute a state–feedback switching law that minimizes a cost function over an infinite time horizon.
- [7] considers switching costs, but only for finite switching sequences.
- For each state in a training set, a state-action pair is generated based on the estimated value function associated with the current policy.
- To address the classification task, the authors employ the Guaranteed Error Machine (GEM) classifier [17], which provides theoretical guarantees on the probability of classification errors.

### II. PRELIMINARIES AND PROBLEM STATEMENT

- The switching signal at time k may depend on the previous mode of the system, identified by σ(k−1).
- Open-loop policies select the control input σ(k) based only on the initial state, σ(k) = πol(x(0), q(0), k), i.e. without any direct measurement of the effects of past inputs.
- The authors focus on the infinite–horizon case, for which the optimal control design and implementation are easier.
- Furthermore, only a finite number of initial states can be considered, so that the desired map must be constructed by generalization over the whole state space based on a finite number of samples.
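
The closed-loop setting above can be illustrated with a small simulation. This is a minimal sketch, not the paper's formulation: the cost structure (a per-stage cost plus a fixed penalty for each mode change), the truncation to a finite number of steps, and every name and number in it (`affine_step`, `closed_loop_cost`, `MODES`, the toy policy) are assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Mode = Tuple[List[List[float]], List[float]]  # (A_i, b_i) for one mode


def affine_step(mode: Mode, x: Sequence[float]) -> Vector:
    """Apply one mode's affine dynamics: returns A_i x + b_i."""
    A, b = mode
    return [sum(a * v for a, v in zip(row, x)) + b_i for row, b_i in zip(A, b)]


def closed_loop_cost(
    modes: Sequence[Mode],
    policy: Callable[[Sequence[float], int], int],
    x0: Sequence[float],
    q0: int,
    steps: int,
    stage_cost: Callable[[Sequence[float]], float],
    switch_cost: float,
) -> float:
    """Simulate `steps` transitions under a state-feedback policy and
    accumulate stage costs plus a penalty for every mode change."""
    x, q, total = list(x0), q0, 0.0
    for _ in range(steps):
        sigma = policy(x, q)  # closed loop: uses current state and previous mode
        total += stage_cost(x) + (switch_cost if sigma != q else 0.0)
        x = affine_step(modes[sigma], x)
        q = sigma
    return total


# Hypothetical scalar example with two modes (values chosen for illustration).
MODES = [([[0.5]], [0.0]), ([[0.5]], [1.0])]
total = closed_loop_cost(
    MODES,
    policy=lambda x, q: 1 if x[0] < 1.0 else 0,
    x0=[4.0],
    q0=0,
    steps=5,
    stage_cost=lambda x: x[0] ** 2,
    switch_cost=1.0,
)
```

An open-loop policy would instead fix the whole sequence σ(0), σ(1), … from x(0) and q(0) alone, so the call to `policy(x, q)` inside the loop is what makes this sketch closed-loop.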

### III. CLASSIFICATION-BASED TWO-STAGE ALGORITHM

- In this section the authors introduce an algorithm to compute offline a closed-loop control policy π̄∗cl that minimizes (7).
- The computed optimal policy is then stored in memory and applied on–line.
- The policy π̄∗cl is a map from the state to the switching signal.
- The next two sub–sections explain in detail the two main stages of the algorithm.
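
The offline/online split described above can be sketched as a two-function skeleton. This is only an illustration under stated assumptions: the labelling rule `colder_room` is hypothetical, the previous mode is left out of the state for brevity, and a 1-nearest-neighbour rule stands in for the classifier (the paper uses GEM instead).

```python
import random
from typing import Callable, List, Sequence, Tuple

State = Tuple[float, ...]
Dataset = List[Tuple[State, int]]


def datagen(states: Sequence[State], first_action: Callable[[State], int]) -> Dataset:
    """Stage 1 (offline): attach a control action to every sampled state."""
    return [(x, first_action(x)) for x in states]


def learn(dataset: Dataset) -> Callable[[State], int]:
    """Stage 2 (offline): generalize the samples to a map over the state space.
    A 1-nearest-neighbour rule is used here as a stand-in classifier."""

    def policy(x: State) -> int:
        _, label = min(
            dataset,
            key=lambda pair: sum((a - b) ** 2 for a, b in zip(pair[0], x)),
        )
        return label

    return policy


def colder_room(x: State) -> int:
    """Hypothetical labelling rule: heat the room with the lowest temperature."""
    return min(range(len(x)), key=lambda i: x[i])


rng = random.Random(0)
states = [(rng.uniform(15.0, 25.0), rng.uniform(15.0, 25.0)) for _ in range(200)]
policy = learn(datagen(states, colder_room))  # computed once, offline
mode = policy((16.0, 24.0))                   # cheap lookup, online
```

The point of the structure is that all expensive work sits in `datagen` and `learn`; at run time the controller only evaluates the stored map.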

### A. Data-set generation: the DATAGEN subroutine

- If L is sufficiently large, the finite-horizon cost is a good approximation of the infinite horizon cost and the value obtained for the control input at time 0 approximates the optimal one for the infinite horizon case.
- This optimization problem can be efficiently solved via MIQP [19].
- This equation is nonlinear, since it involves products between states and logical inputs.
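- Introducing the auxiliary continuous variables zi(k) ∈ ℝⁿ and setting x(k + 1) = ∑ᵢ zi(k), with the sum taken over the m modes, the nonlinear constraint can be transformed into a set of mixed-integer linear inequalities using the so-called "big-M" approach [10].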
- The cost function can be convexified by adding a correction term that is constant with respect to the optimization variables and hence does not change the optimal switching sequence (details are omitted for brevity).
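
To make the role of the horizon L concrete, the sketch below labels one initial state by exhaustively enumerating all switching sequences of length L and keeping the first action of the cheapest one. Note that this swaps in brute-force enumeration for the MIQP solver used in the paper, so it only scales to very small m and L (there are m^L sequences). The quadratic stage cost, the cost on the final state, the switching penalty, and the two scalar modes are all assumptions made for illustration.

```python
from itertools import product
from typing import Callable, List, Sequence, Tuple

Mode = Tuple[List[List[float]], List[float]]  # (A_i, b_i) for one mode


def first_optimal_action(
    modes: Sequence[Mode],
    x0: Sequence[float],
    q0: int,
    horizon: int,
    stage_cost: Callable[[Sequence[float]], float],
    switch_cost: float,
) -> Tuple[int, float]:
    """Enumerate every switching sequence of length `horizon` and return
    (first action of the cheapest sequence, its cost)."""
    best_cost, best_first = float("inf"), -1
    for seq in product(range(len(modes)), repeat=horizon):
        x, q, cost = list(x0), q0, 0.0
        for sigma in seq:
            cost += stage_cost(x) + (switch_cost if sigma != q else 0.0)
            A, b = modes[sigma]
            x = [sum(a * v for a, v in zip(row, x)) + b_i for row, b_i in zip(A, b)]
            q = sigma
        cost += stage_cost(x)  # cost of the state reached at the end of the horizon
        if cost < best_cost:
            best_cost, best_first = cost, seq[0]
    return best_first, best_cost


# Hypothetical example: drive a scalar state towards 2.0.
MODES = [([[0.5]], [0.0]), ([[0.5]], [1.0])]
action, cost = first_optimal_action(
    MODES, [0.0], 0, 2, lambda x: (x[0] - 2.0) ** 2, 0.1
)
```

Repeating this call for every sampled initial state (and previous mode) yields the labelled data set that the classifier is trained on.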

### B. The GEM classification machine: the LEARN procedure

- The LEARN procedure consists in training a classifier on the data gathered as explained in the previous subsection.
- Interestingly, the derived bound for N is independent of the state space dimension n.
- All instances included in R1 share the same label as x1 and are thus removed from the training set, while the instances on the boundary Ω(R1) of the region are marked as “active” points and added to a set Q.
- For more details on the GEM algorithm refer to [17].
- The specific nature of the considered problem, which deals with affine switching systems, may induce a special structure in the switching regions (see Remark 3.2 in the paper).
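
The region-growing step described above can be sketched as follows. This is a simplified reading of that single step, not the GEM algorithm of [17]: it assumes ball-shaped regions under the Euclidean distance, and the function name `grow_region` and the sample data are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]
Sample = Tuple[Point, int]


def grow_region(center: Point, label: int, samples: Sequence[Sample]):
    """Grow an open ball around `center` until it touches a sample whose
    label differs from `label`.

    Returns (radius, inside, boundary, remaining):
      inside    - samples strictly inside the ball (all share `label`)
      boundary  - differently labelled points lying exactly on the ball surface
      remaining - samples left in the training set after removing `inside`
    """
    rivals = [(math.dist(center, x), x) for x, y in samples if y != label]
    radius = min((d for d, _ in rivals), default=math.inf)
    inside = [(x, y) for x, y in samples if math.dist(center, x) < radius]
    boundary = [x for d, x in rivals if d == radius]
    remaining = [(x, y) for x, y in samples if math.dist(center, x) >= radius]
    return radius, inside, boundary, remaining


# Hypothetical training set: three points labelled 0 and two labelled 1.
samples = [
    ((0.0, 0.0), 0),
    ((1.0, 0.0), 0),
    ((0.0, 2.0), 0),
    ((3.0, 0.0), 1),
    ((0.0, -4.0), 1),
]
radius, inside, active, remaining = grow_region((0.0, 0.0), 0, samples)
```

In this sketch the points returned in `active` play the role of the "active" set Q, from which further regions would be grown on the remaining samples.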

### IV. NUMERICAL EXAMPLE

- The proposed classification-based control design methodology is tested on a modified version of a benchmark multi-room heating control problem described in [22].
- A switching control strategy must be designed to decide at each time step which room should be heated, depending on the temperature values in all the rooms.
- To evaluate the performance of the policy obtained by the proposed algorithm (denoted π̄∗GEM in the following), the authors compared it with a standard MPC policy (referred to as π̄∗MPC).
- Notice that in the absence of switching costs the optimal policy is independent of the current mode.
- Notice also that, while equation (13) provides a lower bound for the probability that PE(π̄∗GEM) ≤ ε, the practical application of the GEM classifier typically leads to better performance.
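
The gap between a theoretical bound on the error probability and the error seen in practice can be checked empirically by counting how often a learned policy disagrees with a reference labelling on test states. The sketch below shows only that counting step; it does not reproduce bound (13), and the two policies and the integer temperature grid are hypothetical.

```python
from typing import Callable, Iterable, Tuple

State = Tuple[int, int]


def empirical_error(
    policy: Callable[[State], int],
    reference: Callable[[State], int],
    test_states: Iterable[State],
) -> float:
    """Fraction of test states on which `policy` and `reference` disagree."""
    states = list(test_states)
    mismatches = sum(1 for x in states if policy(x) != reference(x))
    return mismatches / len(states)


def reference(x: State) -> int:
    """Hypothetical reference rule: heat room 0 unless it is warmer than room 1."""
    return 0 if x[0] <= x[1] else 1


def learned(x: State) -> int:
    """Hypothetical learned rule whose switching boundary is shifted by one degree."""
    return 0 if x[0] <= x[1] + 1 else 1


# Test states on an integer grid of temperatures from 15 to 24 degrees.
grid = [(t0, t1) for t0 in range(15, 25) for t1 in range(15, 25)]
error = empirical_error(learned, reference, grid)
```

Here the two rules disagree only on the 9 grid points where the first room is exactly one degree warmer than the second, so the empirical error is 9/100.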

### V. CONCLUSIONS

- A classification-based approach has been proposed for the optimal control of discrete-time switched affine systems.
- The proposed method operates in two steps.
- First, a number of initial states are drawn from a uniform distribution over the state space, and an optimal control action is associated with each of them.
- Precise bounds can be derived on the generalization capabilities of the classifier, which indirectly affect the control performance.
- Some simulation experiments on a benchmark problem reveal that the difference with respect to a standard MPC policy is small.

##### References

### "A classification-based approach to ..." refers background or methods in this paper

...and setting x(k + 1) = ∑m i=1 zi(k), one can transform constraint (10) into a set of mixed-integer linear inequalities by using the so-called “big-M” approach [10]....

[...]

...To this end, it is useful to reformulate system (1) as a Mixed Logical Dynamical (MLD) system [10]....

[...]

...Similarly to [10], we employ Mixed Integer Quadratic Programming (MIQP) to determine the optimal control policy....

[...]

...However, while in [10] MIQP problems are repeatedly solved on–line to determine the optimal control sequence at each time instant (only the first control is applied every time, according to the receding horizon strategy), MIQP is here used for the off–line computation of the switching law....

[...]

### "A classification-based approach to ..." refers background in this paper

...Indeed, since the system is stationary and the cost function has a time–invariant cost per stage, it turns out that there exists an optimal closed–loop stationary control policy, [18]....

[...]

### "A classification-based approach to ..." refers background in this paper

...A general framework for the optimal control of switched systems was established by [1] in the context of hybrid systems....

[...]

### "A classification-based approach to ..." refers methods in this paper

...When online optimization is not viable, [11], [12] suggest a multi–parametric programming approach for solving a finite– horizon hybrid optimal control problem in a state–feedback form....

[...]