FI MU
Faculty of Informatics
Masaryk University Brno
Stochastic Real-Time Games with
Qualitative Timed Automata Objectives
by
Tomáš Brázdil
Jan Krčál
Jan Křetínský
Antonín Kučera
Vojtěch Řehák
FI MU Report Series FIMU-RS-2010-05
Copyright © 2010, FI MU. August 2010

Copyright © 2010, Faculty of Informatics, Masaryk University.
All rights reserved.
Reproduction of all or part of this work
is permitted for educational or research use
on condition that this copyright notice is
included in any copy.
Publications in the FI MU Report Series are in general accessible
via WWW:
http://www.fi.muni.cz/reports/
Further information can be obtained by contacting:
Faculty of Informatics
Masaryk University
Botanická 68a
602 00 Brno
Czech Republic

Stochastic Real-Time Games with Qualitative
Timed Automata Objectives
Tomáš Brázdil    Jan Krčál    Jan Křetínský    Antonín Kučera    Vojtěch Řehák
Faculty of Informatics, Masaryk University,
Botanická 68a, 60200 Brno,
Czech Republic
{brazdil, krcal, kucera, rehak}@fi.muni.cz
jan.kretinsky@in.tum.de
December 13, 2010
Abstract
We consider two-player stochastic games over real-time probabilistic processes where the
winning objective is specified by a timed automaton. The goal of player □ is to play in such a
way that the play (a timed word) is accepted by the timed automaton with probability one.
Player ⋄ aims at the opposite. We prove that whenever player □ has a winning strategy, then
she also has a strategy that can be specified by a timed automaton. The strategy automaton
reads the history of a play, and the decisions taken by the strategy depend only on the region
of the resulting configuration. We also give an exponential-time algorithm which computes
a winning timed automaton strategy if it exists.

* The authors are supported by the Alexander von Humboldt Foundation (T. Brázdil), the Institute for
Theoretical Computer Science, project No. 1M0545 (J. Krčál), Brno Municipality (J. Křetínský), and the
Czech Science Foundation, grants No. P202/10/1469 (A. Kučera), No. 201/08/P459 (V. Řehák), and
No. 102/09/H042 (J. Krčál).
** On leave at TU München, Boltzmannstr. 3, Garching, Germany.

1 Introduction

In this paper, we study stochastic real-time games (SRTGs) which are obtained as a natural
game-theoretic extension of generalized semi-Markov processes (GSMP) [1, 2, 3] or real-time
probabilistic processes (RTP) [4]. Intuitively, all of these formalisms model systems which react
to certain events, such as message receipts, subsystem failures, timeouts, etc. A common
characteristic of all events is that they are delayed (it takes some time before an initiated event
actually occurs) and concurrent (there can be several previously initiated events that are currently
awaited). For example, if two messages e and e′ are sent, it takes some (random) time before
they arrive, and one can specify, or approximate, the densities $f_e$, $f_{e'}$ of their arrival times. When
e arrives (say, after 20 time units), the system reacts to this event by changing its state, and awaits
e′ in a new state. The arrival time of e′ in the new state is measured from zero again, and its
density $f_{e'|20}$ is obtained from $f_{e'}$ by incorporating the condition that e′ is delayed for at least
20 time units. That is, $f_{e'|20}(x) = f_{e'}(x + 20) / \int_{20}^{\infty} f_{e'}(y)\,dy$. Note that if the delays of all
events are exponentially distributed, then $f_e = f_{e|b}$ for every $b \in \mathbb{R}_{\geq 0}$, and thus we obtain
continuous-time Markov chains (see, e.g., [5]) and continuous-time stochastic games [6, 7] as
restricted forms of RTPs and SRTGs, respectively.
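For concreteness, the following short derivation (a sketch; the rate $\lambda > 0$ is a generic parameter introduced only here, not a quantity from the paper) spells out why exponentially distributed delays satisfy $f_e = f_{e|b}$ for every $b \geq 0$.

```latex
% Memorylessness of exponential delays (sketch; \lambda > 0 is an assumed generic rate).
% The conditioning rule from the text, generalized from 20 to an arbitrary b, is
%   f_{e|b}(x) = f_e(x + b) / \int_b^\infty f_e(y)\,dy .
% For f_e(x) = \lambda e^{-\lambda x} we get
\begin{align*}
  f_{e|b}(x)
    = \frac{\lambda e^{-\lambda (x+b)}}{\int_b^{\infty} \lambda e^{-\lambda y}\,dy}
    = \frac{\lambda e^{-\lambda x}\, e^{-\lambda b}}{e^{-\lambda b}}
    = \lambda e^{-\lambda x}
    = f_e(x),
\end{align*}
% so the residual delay does not depend on the time b already elapsed, which is
% exactly the restriction under which RTPs and SRTGs collapse to continuous-time
% Markov chains and continuous-time stochastic games.
```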
Intuitively, a SRTG is a finite graph (see Fig. 1) with three types of nodes: states (drawn as
large circles), controls, where each control can be either internal or adversarial (drawn as boxes
and diamonds, respectively), and actions (drawn as small filled circles). In each state s, there
is a finite subset E(s) of events scheduled in s (the events scheduled in s are those which are
“awaited” in a given state; the other events are disabled). Each state s can react to every event of
E(s) by entering a designated control c, where player □ or player ⋄ chooses some of the available
actions. Each action is associated with a fixed probability distribution over states. In general,
both players can use randomized strategies, which means that they do not necessarily select just
a single action but a probability distribution over the available actions, which is multiplied with
the distributions associated to actions. Then, the next state is chosen randomly according to the
constructed probability distribution, and the play goes on. Whenever a new state s′ is entered
from a previous state s along a play, each event scheduled in s′ is assigned a new delay which is
chosen randomly according to the corresponding (conditional) density. The state s′ then “reacts”
to the event with the least delay (under the assumptions adopted in this paper, the probability of
assigning the same delay to different events is zero).
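To make the round-based dynamics concrete, here is a minimal Python sketch of one round of a play. It is our own illustration rather than code from the paper; in particular, the data layout (`scheduled`, `react`, `owner`, `actions`, `action_dist`), the strategy interface, and the exponential placeholder in `sample_delay` are all assumptions made for the sketch.

```python
import random

# Hypothetical encoding of one SRTG round (illustration only).
#   scheduled[s]        : set of events awaited in state s, i.e. E(s)
#   react[(s, e)]       : control entered when state s reacts to event e
#   owner[c]            : 'box' (internal) or 'diamond' (adversarial)
#   actions[c]          : actions available in control c
#   action_dist[a]      : fixed probability distribution over successor states
#   strategy_*(c, acts) : returns a probability distribution over acts
#   elapsed[e]          : time event e has already been awaited

def sample_delay(event, elapsed, rate=1.0):
    # Placeholder for sampling from the conditional density f_{e|elapsed}.
    # With exponential delays the elapsed time is irrelevant (memorylessness).
    return random.expovariate(rate)

def srtg_round(state, scheduled, react, owner, actions, action_dist,
               strategy_box, strategy_diamond, elapsed):
    # 1. Every event scheduled in `state` gets a (conditional) random delay.
    delays = {e: sample_delay(e, elapsed.get(e, 0.0)) for e in scheduled[state]}
    # 2. The event with the least delay occurs (ties have probability zero
    #    under the paper's assumptions).
    event = min(delays, key=delays.get)
    control = react[(state, event)]
    # 3. The owner of the control picks a (possibly randomized) distribution
    #    over the available actions.
    strategy = strategy_box if owner[control] == 'box' else strategy_diamond
    action_probs = strategy(control, actions[control])
    # 4. The chosen distribution over actions is combined ("multiplied") with
    #    the fixed distributions over states attached to the actions.
    successor_probs = {}
    for a, pa in action_probs.items():
        for s_next, ps in action_dist[a].items():
            successor_probs[s_next] = successor_probs.get(s_next, 0.0) + pa * ps
    # 5. The next state is drawn from the combined distribution.
    states, weights = zip(*successor_probs.items())
    next_state = random.choices(states, weights=weights)[0]
    return next_state, event, delays[event]
```

A strategy here is any function mapping a control and its available actions to a probability distribution over those actions, which matches the informal description of randomized strategies above.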
Our contribution. In this work we consider SRTGs with deterministic timed automata
(DTA) objectives. Intuitively, a timed automaton “observes” a play of a given SRTG and checks
that certain timing constraints are satisfied. A simple example of a property that can be en-
coded by a DTA is “whenever a new request is generated, it is either serviced within the next
10 time units, or the system eventually enters a safe state”. In this case, we want to set up the
internal controls so that the above property holds for almost all plays, no matter what decisions
are taken in adversarial controls.

[Figure 1: An example of a stochastic real-time game.]

Hence, the aim of player □ is to maximize the probability that a play is accepted by a given
timed automaton, while player ⋄ aims at the opposite.
By applying the result of [8], we obtain that SRTGs with DTA objectives have a value, i.e.,
$\sup_{\sigma} \inf_{\pi} P^{\sigma,\pi} = \inf_{\pi} \sup_{\sigma} P^{\sigma,\pi}$, where σ and π range over all strategies of player □ and player ⋄,
and $P^{\sigma,\pi}$ is the probability of all plays satisfying a given DTA objective. This immediately raises
the question whether the players have optimal strategies which guarantee the equilibrium value
against every strategy of the opponent. We show that the answer is negative. Then, we con-
centrate on the qualitative variant of the problem, which is perhaps most interesting from the
practical point of view. An almost-sure winning strategy for player □ is a strategy such that
for every strategy of player ⋄, the probability of all plays satisfying a given DTA objective is
equal to one. The main result of this paper is the following: We show that if player □ has some
almost-sure winning strategy, then she also has a DTA almost-sure winning strategy, which can
be encoded by a deterministic timed automaton A constructible in exponential time. The au-
tomaton A reads the history of a play, and the decision taken by the corresponding DTA strategy
depends only on the region of the resulting configuration entered by A.
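To give a feel for the request/service property used as an example above, the following Python sketch implements a one-clock monitor for it. This is our own illustrative encoding, not the DTA construction of the paper; the event names `request`, `service`, and `safe`, the single-pending-request simplification, and the finite-prefix semantics are assumptions made for the sketch.

```python
IDLE, PENDING, AWAIT_SAFE = "idle", "pending", "await_safe"

def monitor(timed_word, deadline=10.0):
    """Hypothetical one-clock monitor for: every request is serviced within
    `deadline` time units, or the system eventually enters a safe state.
    `timed_word` is a list of (event, absolute_time) pairs with non-decreasing
    times; for simplicity only one outstanding request is tracked at a time."""
    location, request_time = IDLE, None
    for event, now in timed_word:
        # Deadline missed: only a future 'safe' event can discharge the request.
        if location == PENDING and now - request_time > deadline:
            location = AWAIT_SAFE
        if event == "request" and location == IDLE:
            location, request_time = PENDING, now      # start (reset) the clock
        elif event == "service" and location == PENDING:
            location, request_time = IDLE, None        # serviced in time
        elif event == "safe" and location in (PENDING, AWAIT_SAFE):
            location, request_time = IDLE, None        # discharged by safety
    # On a finite prefix nothing is definitively violated: an open obligation
    # could still be discharged by a later 'safe' event.
    return location

# The request at time 1 is not serviced by time 11, so after this prefix the
# monitor still waits for an eventual 'safe' event.
print(monitor([("request", 1.0), ("service", 12.5)]))  # -> await_safe
```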
Our constructions and proofs are combinations of standard techniques (used for timed au-
tomata and finite-state games) and some new non-trivial observations that are specific for the
considered model of SRTGs. We also adapt some ideas presented in [4] (in particular, we use the
concept of δ-separation).
Related work. Continuous-time (semi)Markov chains are a classical and deeply studied
model with a mature mathematical theory (see, e.g., [5, 9]). Continuous-time Markov decision
processes (CTMDPs) [10, 11, 12] combine probabilistic and non-deterministic choice, but all
events are required to be exponentially distributed. Two player games over continuous-time
Markov chains were considered only recently [6, 7]. Timed automata [13] were originally in-
troduced as a non-stochastic model with time. Probabilistic semantics of timed automata was

References

Dynamic Programming and Optimal Control (book)
A theory of timed automata, Alur et al. (journal article)
Probability and Measure (book)
Stochastic Processes (book)

Frequently Asked Questions (9)
Q1. What contributions have the authors mentioned in the paper "Stochastic real-time games with qualitative timed automata objectives" ?

The authors consider two-player stochastic games over real-time probabilistic processes where the winning objective is specified by a timed automaton. The authors prove that whenever player □ has a winning strategy, then she also has a strategy that can be specified by a timed automaton.

The operator “$+_s t$” adds t to all clocks stored in ξ and to all events scheduled in s, and $(e \cup X) := \vec{0}$ resets all clocks of X to zero and assigns zero delay to e.

A probability measure over a measurable space $(\Omega, \mathcal{F})$ is a function $\mathcal{P} : \mathcal{F} \to \mathbb{R}_{\geq 0}$ such that, for each countable collection $\{X_i\}_{i \in I}$ of pairwise disjoint elements of $\mathcal{F}$, $\mathcal{P}(\bigcup_{i \in I} X_i) = \sum_{i \in I} \mathcal{P}(X_i)$, and moreover $\mathcal{P}(\Omega) = 1$.

Being away from the boundary by a fixed δ then intuitively guarantees that any region that is reachable in one step is reachable with a probability bounded from below.

A common characteristic of all events is that they are delayed (it takes some time before an initiated event actually occurs) and concurrent (there can be several previously initiated events that are currently awaited).

For every E′ ⊆ E, the conditional probability of delaying all events in E′ for at least b + t, under the condition that all events in E′ are delayed for at least b, is equal to $\prod_{e \in E'} \int_t^{\infty} f_{e|b}(x)\,dx$.
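A small numerical sanity check of this identity (our own illustration, assuming scipy is available; the exponential and Weibull densities and their parameters are arbitrary choices, not taken from the paper):

```python
import math
from scipy.integrate import quad

def f_exp(x, rate=2.0):             # exponential delay density
    return rate * math.exp(-rate * x)

def f_weibull(x, k=1.5, lam=3.0):   # a non-memoryless delay density
    return (k / lam) * (x / lam) ** (k - 1) * math.exp(-(x / lam) ** k)

def conditional_survival(f, b, t):
    """Computes  ∫_t^∞ f_{e|b}(x) dx  with  f_{e|b}(x) = f(x + b) / ∫_b^∞ f(y) dy."""
    tail_b, _ = quad(f, b, math.inf)
    num, _ = quad(lambda x: f(x + b), t, math.inf)
    return num / tail_b

b, t = 1.0, 0.7
densities = [f_exp, f_weibull]

# Product form from the quoted identity: each scheduled event independently
# survives t further time units after having already waited b.
product = math.prod(conditional_survival(f, b, t) for f in densities)

# Direct computation of P(all delays > b + t) / P(all delays > b), using the
# independence of the individual delays.
num = math.prod(quad(f, b + t, math.inf)[0] for f in densities)
den = math.prod(quad(f, b, math.inf)[0] for f in densities)

print(product, num / den)   # the two values agree up to numerical error
```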

The probability that e is assigned a delay of at most 1 − ε in $s_1$ is 1 − ε, and hence the constructed DFA accepts a play with probability 1 − ε.

The possible waiting time that leads us to that region lies in an interval that has length at least δ, and the probability that an event happens during an interval of this minimal size is bounded from below.

A simple example of a property that can be encoded by a DTA is “whenever a new request is generated, it is either serviced within the next 10 time units, or the system eventually enters a safe state”.