Efficient structured policies for admission control in heterogeneous wireless networks

Amin Farbod, Ben Liang
01 Dec 2007 · Vol. 12, Iss: 5, pp 309-323


Efficient Structured Policies for Admission Control in
Heterogeneous Wireless Networks
Amin Farbod and Ben Liang
Department of Electrical and Computer Engineering
University of Toronto
Toronto, Ontario, CANADA
Email: {afarbod,liang}@comm.utoronto.ca
Abstract—In the near future, demand for Heterogeneous
Wireless Networking (HWN) is expected to increase. QoS
provisioning in these networks is a challenging issue considering
the diversity in wireless networking technologies and the existence
of mobile users with different communication requirements. In
HWNs with their increased complexity, “the curse of dimension-
ality” problem makes it impractical to directly apply the decision
theoretic optimal control methods that are previously used in
homogeneous wireless networks to achieve desired QoS levels.
In this paper, optimal call admission control policies for HWNs
are considered. A decision theoretic framework for the problem
is derived by a dynamic programming formulation. We prove
that for a two-tier wireless network architecture, the optimal
policy has a two-dimensional threshold structure. Further, this
structural result is used to design two computationally efficient
algorithms, Structured Value Iteration and Structured Update
Value Iteration. These algorithms can be used to determine the
optimal policy in terms of thresholds. Although the first one
is closer in its operation to the conventional Value Iteration
algorithm, the second one has a significantly lower complexity.
Extensive numerical observations suggest that, for all practical
parameter sets, the algorithms always converge to the overall
optimal policy. Further, the numerical results show that the
proposed algorithms are efficient in terms of time-complexity
and in achieving the optimal performance.
Index Terms—Stochastic optimal control, quality of service,
Markov processes.
I. INTRODUCTION
Heterogeneous Wireless Networking (HWN) is a major
next-generation networking architecture to support ubiquitous
wireless communications [1]. Current wireless communication
technologies can generally be classified into two groups:
local and global. Local services provide high-bandwidth and
low latency communication services over a small area, while
global services provide lower data rates to a wider area [2].
No single wireless communication technology is capable of
simultaneously providing high bandwidth to a large number of
mobile users over a wide area. HWN is a wireless networking
paradigm to overcome this limitation. Such networks consist
of several layers of different overlapping wireless networking
technologies such as WiMAX/WiFi integration.
Corresponding Author: Ben Liang, Department of Electrical and Computer Engineering, University of Toronto, 10 King's College Road, Toronto, Ontario, M4S 3G4. Email: liang@comm.utoronto.ca, Tel: +1-416-946-8614, Fax: +1-416-978-4425
This work was supported in part by a grant from LG Electronics and by
Bell Canada through its Bell University Laboratories R&D program.
QoS provisioning in HWNs is a challenging issue con-
sidering the diversity of wireless networking technologies.
Conventionally, call admission control (CAC) schemes are
used in wireless networks to achieve a desired QoS level.
A CAC algorithm decides to accept or reject call or handoff
requests or to reserve resources in a resource-sharing system.
CAC schemes for homogeneous cellular networks have been
extensively studied. These schemes can be classified into
near-optimal heuristics [3] [4] and decision-theoretic optimal
methods [5]–[7]. Furthermore, Dynamic Programming (DP)
and Markov Decision Processes (MDP) [8] are used in the
design of optimal CAC algorithms.
However, for almost all realistic models of networking
systems, the computational load of finding an optimal policy
by MDP algorithms is very high. Also, the size of the state
space grows exponentially with system capacity. Numerical
methods [8] to solve MDP problems are iterative and as
reported in [9], there is no known strongly-polynomial time
algorithm to solve them. This can hinder the application of
optimal CAC schemes in practical scenarios. As a remedy,
one common modeling approach is to isolate one cell from
the rest of the network to avoid excessive complexity in state
space [10].
A more effective use of DP-based methods is to obtain
structural results for optimal control problems [11]–[15]. In
structural results, a DP formulation is used to characterize
the structure of possible optimal policies. Then, knowledge of
the policy structure can be exploited to design very efficient
numerical methods to find the optimal policy. As an example,
in [5], it is shown that the optimal control policy for a single
cellular Base-Station (BS) is the well-known guard-channel
policy. Then, knowing that the guard-channel policy is fully
determined by a single threshold, the authors of [5] propose an
efficient method to find it. It has been shown in the literature
that for a large class of optimal control problems the optimal
policy is threshold-based [5] [15].
To the best of our knowledge, there is no existing study
on optimal CAC schemes for heterogeneous networks. Due
to the increased complexity in HWN, direct application of
MDP algorithms is impractical. In this paper, optimal CAC
policies for HWNs are considered and some structural results
are presented. We base our algorithms on theories in optimal
control where dynamic programming methods are used to find
the optimal policy to control a random process over time to achieve a certain optimization goal.

Fig. 1. Cluster traffic arrival and departure. [Figure: an example cluster consisting of an overlay BS and an underlay BS with their coverage areas; the marked flows are the new-call arrival rates λ_c and λ_w, the horizontal handoff rates λ^{in}_{hcc} and λ^{out}_{hcc}, and the vertical handoff rates η_hcw and η_hwc.]

A decision theoretic
framework for the problem is derived by dynamic program-
ming formulation. In this paper, we limit our focus to a two-
tier wireless network architecture. With some modifications,
this model can be applied to more complex scenarios. We
prove that for this architecture the optimal policy has a two-
dimensional threshold structure. Further, this structural result
is used to design two computationally efficient algorithms,
Structured Value Iteration (SVI) and Structured Update Value
Iteration (SUVI). These algorithms can be used to determine
the optimal policy in terms of thresholds. Although the first
one is closer in its operation to the conventional Value Iteration
(VI) algorithm, the second one has a significantly lower
complexity. Extensive numerical observations suggest that, for
all practical parameter sets, the algorithms always converge to
the overall optimal policy.
The rest of the paper is organized as follows. In Section II,
the system model and assumptions are presented. Section III
presents the structural results and discussion on the complexity
of algorithms to solve optimal CAC problems. In Section IV,
the proposed algorithms are explained, and numerical results
are given in Section V, followed by concluding remarks in
Section VI.
II. NETWORK MODEL
An HWN can have a complex configuration involving many
different wireless service layers. It is generally difficult to
treat such complicated scenarios analytically so as to
provide insight into the optimal control of resources in HWNs.
In this work, we assume a two-tier heterogeneous wireless
network architecture consisting of an overlay and an underlay.
This basic 2-tier entity will be called a Cluster. An example
cluster is shown in Fig. 1. There are also neighboring clusters
from which horizontal handovers are possible to this cluster.
We assume tight coupling between different layers of wireless
network [16] in a cluster. In the tight coupling architecture, the
management of different layers is centralized. In what follows,
we assume that there exists a control unit which makes the
CAC decision for the underlay and overlay BSs, and that
clusters act independently and can measure the rate of external
arrival processes such as hand-overs from neighbor clusters.
Note that our mathematical analysis and control algorithms
are independent of underlying wireless technologies as long
as they satisfy some general technical requirements. However,
in the simulation section parameters are chosen with respect
to IEEE 802.16 WiMAX and IEEE 802.11 WLAN standards.
Service requests (more specifically, calls in this work) arrive
according to a memoryless Poisson process, and service times are
also memoryless. Average service times are µ_c and µ_w for calls
inside the overlay and the underlay, respectively. We also assume a
memoryless mobility pattern where calls move to neighbor
clusters or different layers at exponentially distributed times
with rates given in Table I. It is clear that these assumptions
result in exponential channel holding times [17]. This is an
essential requirement in the application of MDP methods. In
this paper, we focus on call-level QoS, which is common in the
CAC literature. Further, the fixed channel allocation (FCA)
scheme is used, and C_c and C_w denote the capacities of the overlay
and the underlay in terms of the maximum number of calls they
can accommodate. FCA easily applies to various wireless
technologies, with a channel being a frequency, time-slot, or code
assignment. Based on the results in [18] and [19], we consider
the case where the number of available voice/multimedia
channels in the underlay/overlay can be quantized.
In our event-based DP, we associate costs to undesirable
control decision events. These costs correspond to the drop-
ping or blocking of arriving calls, and they are incurred when
a call admission request is rejected by the cluster control
unit. They reflect the degradation in the QoS from the service
provider’s perspective or the inconvenience of service denial
perceived by users. The call and handoff arrival rates and their
corresponding rejection costs are shown in Table I. Throughout
the rest of this paper, every call type is called a class.
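For later reference in this section's sketches, the cluster parameters and the per-class rejection costs of Table I can be collected in a small Python dictionary. The numerical values below are assumed purely for illustration; they are not the paper's simulation settings.

    # Assumed example parameters for the two-tier cluster (illustrative values only).
    p = {
        "lam_c": 0.5,        # new-call arrival rate to overlay
        "lam_w": 0.8,        # new-call arrival rate to underlay
        "lam_in_hcc": 0.2,   # horizontal handoff arrivals to overlay from neighbor clusters
        "eta_hcc": 0.05,     # per-call horizontal handoff-out rate (lambda_out_hcc = i * eta_hcc)
        "eta_hcw": 0.10,     # per-call vertical handoff rate, call class 4
        "eta_hwc": 0.10,     # per-call vertical handoff rate, call class 5
        "mu_c": 0.02,        # per-call service parameter, overlay
        "mu_w": 0.03,        # per-call service parameter, underlay
    }
    Cc, Cw = 30, 20          # FCA capacities of the overlay and the underlay
    cost = {"NBC": 1.0, "NBW": 1.0, "HDCC": 5.0, "HDCW": 5.0, "HDWC": 5.0}   # rejection costs C_R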
In the study of CAC schemes several optimality criteria are
considered. The most common ones are minimization of a
total cost (objective) function and minimization of the blocking
probability given some hard constraints on dropping probabili-
ties. In [5], these are referred to as MINOBJ and MINBLOCK,
respectively. The main advantage of MINBLOCK lies in
the fact that it can guarantee some upper bounds on the
dropping probabilities. This can also be achieved by MINOBJ
by adjusting cost ratios. Furthermore, MINBLOCK has the
drawback of not taking into account how much resource is
wasted in reservation to achieve those bounds [10]. In this
paper, we focus on MINOBJ for its flexibility. We can formally
define MINOBJ as

MINOBJ: min g_π = Σ_{k=1}^{L} C_R^{(k)} λ_k P_B^{(k)}     (1)

where C_R^{(k)} is the cost of rejecting a call request of class k, λ_k is the arrival rate of class k calls, P_B^{(k)} is the blocking (dropping) probability for that class, and L is the total number of call classes.
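As a small numerical illustration of (1), using assumed per-class blocking probabilities (again, example values rather than results from the paper):

    # Illustrative evaluation of the MINOBJ objective in Eq. (1).
    # For the handoff classes, the rates below are effective average rates (assumed).
    lam  = {"NBC": 0.5, "NBW": 0.8, "HDCC": 0.2, "HDCW": 0.1, "HDWC": 0.1}          # lambda_k
    cost = {"NBC": 1.0, "NBW": 1.0, "HDCC": 5.0, "HDCW": 5.0, "HDWC": 5.0}          # C_R^(k)
    P_B  = {"NBC": 0.02, "NBW": 0.03, "HDCC": 0.005, "HDCW": 0.004, "HDWC": 0.006}  # assumed

    g_pi = sum(cost[k] * lam[k] * P_B[k] for k in lam)   # Eq. (1)
    print(f"average rejection cost rate g_pi = {g_pi:.4f}")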
III. OPTIMAL CAC POLICY
Decision theoretic optimization for Markovian processes is
a well-known stochastic control method [20]. The Markov
property allows significant reduction in tabular programming
complexity and in some cases makes it possible to obtain
structural results.

TABLE I: Call arrival/handoff rates and rejection costs.

#  Rate     Rejection Cost  Description
1  λ_c      C_NBC           New calls to Overlay
2  λ_w      C_NBW           New calls to Underlay
3  η_hcc    C_HDCC          Handoff to Overlay from Overlay
4  η_hcw    C_HDCW          Handoff to Overlay from Underlay
5  η_hwc    C_HDWC          Handoff to Underlay from Overlay

An MDP is determined by four components: state space S, action space A, state transition probabilities P(·), and a cost function C(·). The performance criteria can
be formulated with respect to finite or infinite horizons and
for average-cost or discounted-cost problems. The solution to
an MDP is called a policy or rule. A policy maps the state
space to actions, π : S → A, such that the optimization goal
is achieved. A large class of policies, in which the decision is
independent of time, are called stationary policies.
In this work, we wish to minimize the average expected cost
for an infinite-horizon problem. This reflects our concern about
long-run QoS performance. We start with a finite-horizon
optimal cost function and we show that the solution to the
infinite-horizon problem has the same structure. Let us denote
by V_k(i, j) the optimal cost function for a k-stage problem with the initial state (i, j), where i is the number of calls in overlay and j is the number of calls in underlay at the start of the decision epoch. Using the uniformization technique [21], we can write V_{k+1}(i, j) recursively as

V_{k+1}(i, j) = (λ_c / v_max) min( V_k(i, j) + C_NBC , V_k(i+1, j) )
  + (λ_w / v_max) min( V_k(i, j) + C_NBW , V_k(i, j+1) )
  + (λ^{in}_{hcc} / v_max) min( V_k(i, j) + C_HDCC , V_k(i+1, j) )
  + (i η_hcw / v_max) min( V_k(i−1, j) + C_HDCW , V_k(i−1, j+1) )
  + (j η_hwc / v_max) min( V_k(i, j−1) + C_HDWC , V_k(i+1, j−1) )
  + (i µ_c / v_max) V_k(i−1, j) + (j µ_w / v_max) V_k(i, j−1)
  + (λ^{out}_{hcc} / v_max) V_k(i−1, j) + (1 − v_out(i, j) / v_max) V_k(i, j)     (2)

where v_out(i, j) is the rate of going out of state s = (i, j),

v_out(i, j) = λ_c + λ_w + λ^{in}_{hcc} + λ^{out}_{hcc} + i η_hcw + j η_hwc + i µ_c + j µ_w ,     (3)

λ^{out}_{hcc} = i η_hcc, and v_max is the uniformization parameter such that v_max ≥ v_out(i, j) for every (i, j) pair. Since v_out(i, j) is increasing in i and j, we choose v_max = v_out(C_c, C_w). Equation (2) consists of nine terms, each reflecting one possible event: the first three terms reflect arrivals to the cluster, the fourth and fifth terms account for vertical handovers, the next three terms are for departure events, and the last term is due to the uniformization technique, where staying in the same state is possible. We also assume the following boundary conditions
V_k(C_c + 1, j) = ∞  and  V_k(−1, j) = 0,   0 ≤ j ≤ C_w
V_k(i, C_w + 1) = ∞  and  V_k(i, −1) = 0,   0 ≤ i ≤ C_c.     (4)
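The recursion (2) together with the boundary conditions (4) maps almost line by line into code. The following Python sketch is only illustrative: the parameter dictionary p and the cost dictionary follow the assumed example from Section II, and v_max is taken as v_out(C_c, C_w) as in the text.

    import math

    def v_out(i, j, p):
        # Eq. (3): total rate out of state (i, j); p holds the model parameters.
        return (p["lam_c"] + p["lam_w"] + p["lam_in_hcc"] + i * p["eta_hcc"]
                + i * p["eta_hcw"] + j * p["eta_hwc"] + i * p["mu_c"] + j * p["mu_w"])

    def value_iteration_step(V, p, Cc, Cw, cost):
        """One application of the recursion in Eq. (2); V is a dict over states (i, j)."""
        vmax = v_out(Cc, Cw, p)

        def Vb(i, j):
            # Boundary conditions of Eq. (4).
            if i > Cc or j > Cw:
                return math.inf
            if i < 0 or j < 0:
                return 0.0
            return V[(i, j)]

        Vnew = {}
        for i in range(Cc + 1):
            for j in range(Cw + 1):
                val  = p["lam_c"] / vmax * min(Vb(i, j) + cost["NBC"], Vb(i + 1, j))
                val += p["lam_w"] / vmax * min(Vb(i, j) + cost["NBW"], Vb(i, j + 1))
                val += p["lam_in_hcc"] / vmax * min(Vb(i, j) + cost["HDCC"], Vb(i + 1, j))
                val += i * p["eta_hcw"] / vmax * min(Vb(i - 1, j) + cost["HDCW"], Vb(i - 1, j + 1))
                val += j * p["eta_hwc"] / vmax * min(Vb(i, j - 1) + cost["HDWC"], Vb(i + 1, j - 1))
                val += i * p["mu_c"] / vmax * Vb(i - 1, j) + j * p["mu_w"] / vmax * Vb(i, j - 1)
                val += i * p["eta_hcc"] / vmax * Vb(i - 1, j)       # lambda_out_hcc = i * eta_hcc
                val += (1.0 - v_out(i, j, p) / vmax) * Vb(i, j)     # uniformization self-loop
                Vnew[(i, j)] = val
        return Vnew

    # V0 = {(i, j): 0.0 for i in range(Cc + 1) for j in range(Cw + 1)}
    # V1 = value_iteration_step(V0, p, Cc, Cw, cost)

Repeated application of this map is exactly what the VI-type algorithms of Section IV organize more efficiently.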
A. Optimality of Threshold-Based Policy
We show that the optimal policy to minimize the average
cost for the system model given in Section II is a 2D threshold-
based policy. In a single threshold system, that threshold is
independent of the system state. When the system state is more
complex, such as in the HWNs case, the threshold for the
operation of one system component might depend on the state
of another one. In our scenario, it gives rise to a 2D threshold
structure.
From (2), it can be seen that when a call of class L arrives, it is only admitted if V_k(i', j') − V_k(i, j) ≤ C^{(L)}_R, where state s = (i, j) is the current state, state t = (i', j') is the next state if we admit the call, and C^{(L)}_R is the rejection cost for class L.
Let us define two difference operators for V_k(i, j),

Δ_i V_k(i, j) = V_k(i, j) − V_k(i−1, j)
Δ_j V_k(i, j) = V_k(i, j) − V_k(i, j−1).     (5)

For every fixed j there is a sequence of Δ_i V_k(i, j) for i = 1 . . . C_c, and vice versa. In what follows we claim that the sequences Δ_i V_k(i, j) and Δ_j V_k(i, j) are increasing in i and j, respectively. For the proof, refer to the Appendix.
Lemma 1: V_k(i, j) is convex and monotonically non-decreasing in i (or j) for every fixed j (or i).
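As a purely numerical illustration of the difference operators in (5), and of the monotone-difference property claimed by Lemma 1, on an arbitrary toy array (this is not a proof):

    # Illustrative check on an assumed toy value function V[i][j].
    V = [[0.0, 1.0, 2.5],
         [1.0, 2.2, 4.0],
         [3.0, 4.5, 6.8]]   # convex and non-decreasing in each index

    def delta_i(V, i, j):
        return V[i][j] - V[i - 1][j]    # Eq. (5), defined for i >= 1

    for j in range(3):
        diffs = [delta_i(V, i, j) for i in range(1, 3)]
        assert all(d2 >= d1 for d1, d2 in zip(diffs, diffs[1:])), "Delta_i V not increasing"
        print(f"j={j}: Delta_i V = {diffs}")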
It has been shown in [21] that for average-cost problems with finite S and A, the optimal policy is stationary. Further, we are only interested in stationary policies which result in irreducible chains. The chain defined by V_k(i, j) is also aperiodic since it contains loops into the same state. According to Theorem (6.6.2) in [21], for irreducible and aperiodic Markov decision processes, the difference between the upper and lower bounds of V_{k+1}(i, j) − V_k(i, j) converges to the optimal average cost per unit time as k → ∞. Also, Theorem (6.6.1) in [21] implies that the optimal per-unit-time average cost function has the same structure as V_k(i, j) defined in (2). Hence, structural results on V_k(i, j) directly hold for the infinite-horizon per-unit-time cost function.
Theorem 1: A 2D threshold-based policy is an optimal
solution to the control problem with the system model given
in (2).
Proof: Without loss of generality, let us assume that a call of class L arrives to overlay when the system state is s = (i−1, j'). The proof for arrivals to underlay is similar. If the call is admitted, the increase in the optimal cost function is Δ_i V_k(i, j'). We show that the CAC decision can be expressed in terms of thresholds determined by Δ_i V_k(i, j') and C^{(L)}_R. From Lemma 1, we know that the sequence Δ_i V_k(i, j') is increasing in i. If there is an î for which Δ_i V_k(î, j') ≤ C^{(L)}_R and Δ_i V_k(î+1, j') > C^{(L)}_R, then î is the threshold for admission to overlay when there are j' calls in underlay. Otherwise, if for every î ∈ {1, . . . , C_c} we have that Δ_i V_k(î, j') ≤ C^{(L)}_R, then that call is of high priority and it is only rejected when the system is full. Also, if for every î, Δ_i V_k(î, j') > C^{(L)}_R, then that call class is of low priority and it is never admitted to the system.

Note that for every call class L, the threshold î depends on Δ_i V_k(î, j'), which in turn depends on j'. This implies that the threshold for overlay operation depends on the underlay state. Therefore, the optimal control policy has to be 2D threshold-based to account for this correlation.
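The proof suggests a direct way to read the thresholds off a (converged) value function: for each underlay occupancy j' and each call class, take the largest î with Δ_i V(î, j') ≤ C^{(L)}_R. The sketch below is an illustrative implementation of that rule; the function and variable names are ours.

    def overlay_threshold(V, Cc, j_prime, rejection_cost):
        """Largest i_hat with Delta_i V(i_hat, j') <= C_R; arrivals are admitted while i < i_hat.
        Returns 0 for a class never admitted at this j' (low priority) and Cc for a class
        rejected only when the overlay is full (high priority)."""
        i_hat = 0
        for i in range(1, Cc + 1):
            delta = V[(i, j_prime)] - V[(i - 1, j_prime)]   # Delta_i V(i, j'), Eq. (5)
            if delta <= rejection_cost:
                i_hat = i
            else:
                break   # by Lemma 1 the differences only grow beyond this point
        return i_hat

    # Th_c(j, class) = overlay_threshold(V, Cc, j, cost[class]) for every j and overlay class.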
B. CAC Algorithm
We denote by π = ⟨Th_c[C_w, M], Th_w[C_c, N]⟩ the class of threshold-based policies. Here, M is the number of call classes entering the overlay and N is the number of call classes entering the underlay. Every class within M or N will be called a subclass. In our scenario, M is 3 and N is 2. The CAC algorithm, when the system state is s = (i, j) at the arrival epoch and policy π is employed, is given in Algorithm 1. When a call of subclass L' arrives to the overlay (underlay), it is only admitted if the number of active calls in the overlay (underlay) is less than the threshold for that call type. This threshold is a function of the call subclass and the number of calls in the other network, i.e., the underlay (overlay).

A CAC algorithm is fully determined given policy π in terms of thresholds. However, finding these values is a non-trivial problem. Efficient computation of these thresholds is considered in the next section.
Algorithm 1 2D Threshold-Based CAC
Input: π = ⟨Th_c[C_w, M], Th_w[C_c, N]⟩; a call of class L arrives; it belongs to subclass L'
Output: Admission Decision
1: if Arrival to overlay then
2:    if i < Th_c(j, L') then
3:       return Admit
4:    else
5:       return Reject
6:    end if
7: else {Arrival to underlay}
8:    if j < Th_w(i, L') then
9:       return Admit
10:   else
11:      return Reject
12:   end if
13: end if
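An illustrative Python rendering of Algorithm 1 is given below. The threshold tables are assumed to be dictionaries keyed by (state of the other network, subclass); this mirrors the pseudocode above rather than adding anything new.

    def cac_decision(i, j, subclass, to_overlay, Th_c, Th_w):
        """2D threshold-based CAC (Algorithm 1): admit iff the occupancy of the target
        network is below the threshold for this subclass, given the other network's state."""
        if to_overlay:
            return "Admit" if i < Th_c[(j, subclass)] else "Reject"
        return "Admit" if j < Th_w[(i, subclass)] else "Reject"

    # Example with an assumed threshold table: subclass 0 arriving to the overlay in state (i, j) = (4, 2).
    Th_c_example = {(2, 0): 6}      # admit subclass-0 overlay arrivals while i < 6 when j = 2
    print(cac_decision(4, 2, 0, True, Th_c_example, {}))   # -> Admit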
C. Finding Policy π
A major requirement for CAC algorithms is their adaptivity
to network traffic dynamics. Since this is generally achieved
by periodically updating the admission policy, the algorithm's computational load has to be minimal. Depending on the
system size, the computation cost of solving a general MDP
can be very high. Several methods such as Value Iteration (VI),
Policy Iteration (PI) and Linear Programming (LP) methods
are developed to solve general MDP problems [8].
According to [9], no strongly-polynomial algorithm is
known for solving MDPs. Although MDPs can be solved by
conversion to LP problems, polynomial-time algorithms for
LP are inefficient and impractical. On the other hand, practical
LP algorithms can result in exponential time-complexity in the
worst case when used to solve MDPs. Consequently, there are
no efficient and practical polynomial-time algorithms to solve
MDPs. Therefore, the computation cost of finding thresholds
for the optimal policy can be a burden if we use any of
these techniques. However, when we know about the optimal
solution structure, we might be able to exploit this knowledge
to solve the problem more efficiently.
Generally, either direct or indirect methods can be employed to find the CAC parameters, i.e., the policy thresholds. Direct methods require calculating the average cost for a given policy π_1. This can be done by modeling the system as a continuous-time Markov chain (CTMC). Note that every MDP problem, given a policy π_1, can be analyzed as a Markov chain. Gaussian elimination-like methods can then be used to find the state probabilities and to calculate the average cost. Once we have the average cost, we can use methods such as multidimensional bisection search [22] to find the parameters that minimize it. The problem with this method is that for a two-tier network with each tier having capacity n, the size of the Markov chain state space would be O(n^2) and Gaussian elimination would take O((n^2)^3) = O(n^6).
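For a rough sense of scale (capacities assumed here only for illustration): with n = 30 for both tiers, the CTMC has about 30 × 30 ≈ 9 × 10^2 states, so a single Gaussian-elimination solve costs on the order of (9 × 10^2)^3 ≈ 7 × 10^8 operations, and this cost is paid for every candidate policy examined by the bisection search.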
However, as explained previously, CAC algorithms have to be lightweight to be of any practical use. In indirect methods we avoid a direct evaluation of the cost function. Instead, we use an iterative approximation. Along with that, we use our prior knowledge of the optimal policy structure to further improve the algorithm's time-complexity.
IV. EFFICIENT COMPUTATIONAL ALGORITHMS
In this section, we introduce efficient computational algo-
rithms to find the optimal CAC policy. We first describe the
conventional Value Iteration (VI) algorithm. We then propose
two efficient algorithms called Structured Value Iteration (SVI)
and Structured Update Value Iteration (SUVI). The basic
principle of these algorithms is similar to VI. However, we
use our prior knowledge of the optimal solution structure to
eliminate unnecessary computations.
A. Conventional VI Algorithm
The conventional Value Iteration (VI) algorithm is based on the Bellman iterative equation [8],

V_n(s) = min_{a ∈ A(s)} { c_s(a) + Σ_{t ∈ S} P_st(a) V_{n−1}(t) }.     (6)

Note that this equation is backward in time, such that V_0(·) is the cost at the end of the process. In every iteration, V_n(s) is calculated for all s ∈ S. Here, S is the state space, and A(s) is the set of possible actions at state s. P_st(a) is the transition probability of going from s to t having taken action a, and c_s(a) is the cost of taking action a in state s. In our model, the system state has two components, the number of calls in overlay i and the number of calls in underlay j; s = (i, j). For every incoming call, either new or hand-off, at any state two actions are possible: accept (denoted by 1) or reject (denoted by 0); A(s) = {0, 1}.
To find the state transition probabilities P_st(a), we use fictitious decision epochs [21]. The computational load of evaluating the cost function in every step highly depends on the density of the P_st(a) matrix. When times between decision epochs are exponentially distributed, we can reduce the computation cost by introducing fictitious decision epochs at which no real decision has to be made. These correspond to departure events when no action is taken. By this technique, at every decision epoch, either real or fictitious, the system state can only change to adjacent states, making many terms in P_st(a) zero. However, to keep track of the epoch type we have to extend the state space by one dimension. The increased computation cost due to this enlarged state-space is compensated by the reduction in the P_st(a) density.
We define the new state variable to be a triple s = (i, j, k). Here, k is the departure or arrival type. We already have 5 call types from Table I. We add a fictitious call event type of 0, which corresponds to call departures, with a fictitious decision of a = 0 to be taken at departure events. In addition, since the decision epochs can be at any randomly distributed time, a Semi-Markov Decision Process (SMDP) model needs to be used [21]. Again, we take the uniformization rate to be v_max = v_out(C_c, C_w). Also, we have to determine v_s(a), the rate of going out of state s having taken action a. Here we give the transition probabilities for some of the state-action combinations in terms of transition rates, with P_st(a) = q_st(a) / v_s(a) and s = (i, j, k):
q_st(a = 1) =
   (i+1)(η_hcc + µ_c)      t = (i, j, 0)
   j µ_w                   t = (i+1, j−1, 0)
   λ_c                     t = (i+1, j, 1)
   λ_w                     t = (i+1, j, 2)
   λ^{in}_{hcc}            t = (i+1, j, 3)
   (i+1) η_hcw             t = (i+1, j, 4)
   j η_hwc                 t = (i+1, j, 5)          (7)

for k ∈ {1, 3}, with v_s(a = 1) = v_out(i+1, j) and v_out(i, j) given in (3). Another example, for k = 4, is

q_st(a = 0) =
   (i−1)(η_hcc + µ_c)      t = (i−2, j, 0)
   j µ_w                   t = (i−1, j−1, 0)
   λ_c                     t = (i−1, j, 1)
   λ_w                     t = (i−1, j, 2)
   λ^{in}_{hcc}            t = (i−1, j, 3)
   (i−1) η_hcw             t = (i−1, j, 4)
   j η_hwc                 t = (i−1, j, 5)          (8)

with v_s(a) = v_out(i−1, j). Note that in the above, a hand-off request from overlay to underlay was initially rejected (a = 0), leaving only i−1 calls in overlay at the start of the decision epoch. We specify the boundary conditions by defining
V_n(C_c + 1, j, k) = ∞   for all j and k
V_n(i, C_w + 1, k) = ∞   for all i and k
V_n(−1, j, k) = 0        for all j and k
V_n(i, −1, k) = 0        for all i and k          (9)
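The rate tables (7) and (8) are straightforward to encode. The Python sketch below is illustrative and covers only the state-action combination of (7); the other combinations follow the same pattern. The parameter dictionary p is the assumed example from Section II.

    def q_rates_accept(i, j, p):
        """Transition rates out of state s = (i, j, k), k in {1, 3}, after accepting (Eq. (7)).
        Returns a dict mapping next states (i', j', k') to rates.
        Entries with zero rate (e.g. the underlay departure when j = 0) would be pruned in practice."""
        return {
            (i,     j,     0): (i + 1) * (p["eta_hcc"] + p["mu_c"]),  # overlay departure or handoff-out
            (i + 1, j - 1, 0): j * p["mu_w"],                         # underlay departure
            (i + 1, j,     1): p["lam_c"],                            # new call to overlay
            (i + 1, j,     2): p["lam_w"],                            # new call to underlay
            (i + 1, j,     3): p["lam_in_hcc"],                       # horizontal handoff to overlay
            (i + 1, j,     4): (i + 1) * p["eta_hcw"],                # vertical handoff event, class 4
            (i + 1, j,     5): j * p["eta_hwc"],                      # vertical handoff event, class 5
        }

    def transition_probs(q, v_s):
        # P_st(a) = q_st(a) / v_s(a), as in the text; v_s is v_out at the post-action state.
        return {t: rate / v_s for t, rate in q.items()}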
Fig. 2. SVI algorithm operation; AR, BR and RR for calls coming to underlay and D = 1. [Figure: the two-dimensional state space, with CELL state i on one axis and WLAN state j on the other, partitioned into an Admission Region, a Border Region, and a Rejection Region.]
For SMDP, (6) needs to be modified to reflect the semi-Markov state transition rates. We define the operator T_V[V(·), s, a] as

T_V[V(·), s, a] = c_s(a) v_s(a) + (v_s(a) / v_max) Σ_{t ∈ S} P_st(a) V(t) + (1 − v_s(a) / v_max) V(s).     (10)

Given this operator we can rewrite (6) for SMDPs as

V_n(s) = min_{a ∈ A(s)} { T_V[V_{n−1}, s, a] }.     (11)
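An illustrative rendering of (10) and (11), keeping the cost term exactly as written above (all function names are ours, and the cost, rate, and transition-probability functions are assumed to be supplied by the caller):

    def T_V(V, s, a, c_s, v_s, v_max, P_st):
        """Eq. (10): uniformized SMDP backup for state s and action a.
        c_s(s, a): immediate cost; v_s(s, a): rate out of s under a;
        P_st(s, a): dict of transition probabilities to next states."""
        expected_next = sum(p * V[t] for t, p in P_st(s, a).items())
        return (c_s(s, a) * v_s(s, a)
                + (v_s(s, a) / v_max) * expected_next
                + (1.0 - v_s(s, a) / v_max) * V[s])

    def vi_update(V, states, actions, c_s, v_s, v_max, P_st):
        # Eq. (11): V_n(s) = min over a in A(s) of T_V[V_{n-1}, s, a]
        return {s: min(T_V(V, s, a, c_s, v_s, v_max, P_st) for a in actions(s)) for s in states}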
B. SVI Algorithm
Theorem 1 states that the optimal solution is a 2D threshold
policy, implying that the admission region for any call type
should be a closed area. An example of this is shown in Fig. 2.
For any given policy π_1 and call subclass, we can partition the state space into three disjoint areas, called Accept-Region (AR), Border-Region (BR), and Reject-Region (RR). We define the region indicator function I_R(s, p) for state s = (i, j) and a call request of subclass p as

I_R(s, p) =  AR   if i − Th_c(j, p) < −D
             BR   if |i − Th_c(j, p)| ≤ D
             RR   if i − Th_c(j, p) > D.     (12)
If a state is within distance D of the threshold level, then it is in BR. D acts as a tuning parameter, determining the size of the area we are willing to re-evaluate in every iteration. An example of the I_R(s, p) classification is shown in Fig. 2 for D = 1, where dotted states correspond to the threshold levels. The indicator function for the underlay subclasses is similar.
Given the indicator function I_R(s, p), we can redefine the action space A(s) as A'(s),

A'(s) =  {0}     if I_R(s, p) = RR
         {1}     if I_R(s, p) = AR
         {0, 1}  if I_R(s, p) = BR.     (13)
Here, we are limiting the set of possible actions. The idea
is that for states inside the admission region it would be
unnecessary to consider a possible reject action if they are
not close to the border. Note that the cost function evaluation
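To make the region classification concrete, here is an illustrative sketch of I_R in (12) and of the restricted action space A'(s) in (13) for an overlay arrival, with 0 denoting reject and 1 denoting accept as in the text; the names are ours.

    def region(i, j, subclass, Th_c, D):
        """Eq. (12): classify state s = (i, j) for a subclass-p overlay arrival."""
        gap = i - Th_c[(j, subclass)]
        if abs(gap) <= D:
            return "BR"
        return "AR" if gap < -D else "RR"

    def restricted_actions(i, j, subclass, Th_c, D):
        """Eq. (13): A'(s) -- only border states keep both actions."""
        r = region(i, j, subclass, Th_c, D)
        return {"RR": [0], "AR": [1], "BR": [0, 1]}[r]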

References

M. L. Puterman, Markov Decision Processes: Discrete Stochastic Dynamic Programming.
Markov Decision Processes (monograph).
Henk Tijms, A First Course in Stochastic Models.
"Vertical handoffs in wireless overlay networks."