• Z-° be a random variable over the underlying probability space of , raW,... such that E(t) is finite and t is a stopping time, i.e., for all k e Z-° the occurrence of the event x — k depends only on the values , . Then Azuma's inequality states that for
every b > 0 we have that both £P(mW - > b) and ^(raM - m^> < -b) are bounded by
and the optional stopping theorem guarantees that E(raM) — E(m(°)).
The semantics of probabilistic pushdown automata is defined in terms of discrete-time Markov chains, which are recalled in our next definition.
Definition 2 A (discrete-time) Markov chain is a triple M — (S, —>•, Prob) where S is a finite or countably infinite set of vertices, —>• C S x S is a total transition relation, and Prob is a function which to each transition s^t of M assigns its probability Prob(s^t) e (0,1] so that for every jeSwe have Y,s^tProb(s^t) — 1.
In the rest of this paper we also write s to indicate that s^t and Prob(s^t) — x. To every s e S we associate the probability space (Run(s), &, £P) where & is the (7-field generated by all basic cylinders Run(w) where w G FPath(s), and & : & —>• [0,1] is the unique probability function such that £P(Run(w)) — jjiertw) wjjere w(i— 1)^hw(i) for every 1 < i < len(w) (the empty product is equal to 1).
2.2 First-Order Theory of the Reals
At many places in this paper, we rely on decision procedures for various fragments of (M, +,*,<), i.e., first-order theory of the reals (also known as "Tarski algebra"). Given a closed first-order formula (xi/d\,... ,xn/dn) holds iff d{ — c,- for all 1 < i < n. (Here 4>(xi/d\,... ,xn/dn) is the closed formula obtained from
qa is a rule of A.
Since pPDA configurations are strings over a finite alphabet, we can interpret sets of configurations as languages. For our purposes, regular sets of configurations are particularly important.
Definition 5 Let %' C Q x F* be a set of configurations. We say that %' is regular if there is a deterministic finite-state automaton (DFA) .2/ over the alphabet Q UF such
The "BPA" acronym stands for Basic Process Algebra and is used mainly for historical reasons.
Probabilistic Pushdown Automata
7
that pa e %' iff the reverse of pa is accepted by .2/. Further, we say that %' is simple if there is a set JMC C g x F such that /?gj e ^ iff a ^ e and the head of pa belongs to Jf.
Remark 1 Since the DFA ,c/ of Definition 5 reads the stack in the bottom-up direction, one can easily simulate ,c/ on-the-fly in the stack alphabet of A. Formally, we construct another pPDA A' which has the same control states as A, the stack alphabet of A' is r' — (F x A) UZo, where Zq is a fresh bottom-of-the-stack marker and A is the set of control states of ,c/, and the rules of A' simulate the execution of For example, if pX qYZ is a rule of A, then for every a e A we add the rule p(X,a)c^q(Y,a')(Z,a) to A', where aa' is a transition in .2/. Obviously, there is a bijective correspondence between Run(M&,pX) and Run(M^/,p(X,ao)Zo), where «0 is the initial state of ,c/. Note that a configuration qYa of A is accepted by ,c/ iff the corresponding configuration q(Y,a)a'Zo of A' satisfies a Aa'-%aj where ay is an accepting state of ,c/. Hence, the regular set of configurations of A encoded by ,c/ corresponds to a simple set of configurations in A' represented by an efficiently constructive set j$? C Q x F'. In particular, note that qe is recognized by ,c/ iff qZo e j$?, which also explains the role of the symbol Zq. Since the size of A' is polynomial in the size of A and ,c/, the above described construction is a generic technique for extending results about simple sets of configurations to regular sets of configurations without any complexity blowup. We use this simple principle at many places in this paper.
Important subclasses of pPDA are stateless pPDA (also known as pBPA) which do not have control states, and probabilistic one-counter automata (pOC) where the stack alphabet has just one symbol (apart from the bottom-of-the-stack marker) and can be interpreted as a counter.
Definition 6 A pBPA is a triple A — (r,^, Prob) where F is a finite stack alphabet, ^ C F x F* is a finite set of rules such that
- for every X e T there is at least one rule of the form X ^ a,
- for every ruleX^ffwe have that len(a) < 2,
and Prob is a probability assignment which to each rule X ^ a assigns its probability Prob(X ^ a) e (0,1] so that for all X e F we have that Y.x^a Prob(X ^ a) — 1. A configuration of A is an element of F*.
Note that each pBPA A can be understood as a pPDA with just one control state p which is omitted in the rules and configurations of A. Thus, all notions introduced for pPDA can be adapted to pBPA. In particular, each pBPA A determines a unique Markov chain where F* is the set of vertices and the transitions are determined in the expected way.
Example 1 Consider a pBPA A with two stack symbols I, A and the rules
0.5 0.5 1
/-->•£, I^AI, A --)•//. A fragment of is shown in Fig. 1.
8
Tomáš Brázdil et al.
Fig. 1 A fragment of MA.
A formal definition of pOC adopted in this paper is consistent with the one used in recent works such as [11,15,10].
Definition 7 A pOC is a tuple A = (Q, 8=0,8>Q,P=Q,P>0), where
- Q is a finite set of control states,
- 8>0 C Q x {-1,0,1} x Q and 8=0 C Q x {0,1} x Q are the sets of positive and zero rules such that each p e Q has an outgoing positive rule and an outgoing zero rule;
- P>0 and P=0 are probability assignments; both assign to each p G g a positive probability distribution over the outgoing transitions in c)>0 and 8=0, respectively, of p.
We say that A is zero-trivial if (5=0 = {(p,0,/?) | p G 2}. A configuration of 4 is a pair p(i) G Q x Z^°.
The Markov chain associated to A has Q x Z-° as the set of vertices, and the transition are determined as expected. Note that the transitions enabled at p(0) are not necessarily enabled at p(i) where i > 0, and vice versa. If A is zero-trivial, then the only transition enabled at p(0) is the loop p(0)-^>p(0). Also observe that p(i) Ap(i+c) iff p(j) Ap(j+c) for all i, j > 0 and c e {-1,0,1}.
Each pOC can also be understood as a pPDA with just two stack symbols / and Z, where Z marks the bottom of the stack, and the number of pushed /'s represents the counter value. The translations between the two models are linear as long as the counter changes are bounded (as in Definition 7) or encoded in unary.
2.4 The Problems of Interest
In this section we formally introduce the main concepts and notions used in performance and dependability analysis of probabilistic systems modeled by discrete-time Markov chains. In the next sections we show how to solve the associated algorithmic problems for pPDA and its subclasses.
For the rest of this section, we fix a Markov chain M — (S, —>•, Prob) where S is finite or countably infinite.
Reachability and termination. Let s G S be an initial vertex and T C S a set of target vertices. Let Reach(s, T) be the set of all w G Run(s) such that w(i) G T for some i G Z-° (sometimes we also write Reach.M(s,T) to prevent confusions about
Probabilistic Pushdown Automata
9
the underlying Markov chain M). The probability of reaching T from s is defined as (Reaches,T)).
For pPDA and its subclasses, we also distinguish a special form of reachability called termination. Intuitively, a recursive system terminates when its initial procedure terminates, i.e., the stack of activation records becomes empty. In pOC, termination corresponds to decreasing the counter to zero (this means that the set of zero rules is irrelevant, and therefore we restrict ourselves to zero-trivial pOC in the context of problems related to termination). Technically, we distinguish between two forms of termination.
- Non-selective termination, where the target set T consists of all configurations with empty stack (or zero counter).
- Selective termination, where T consists of some (selected) configurations with empty stack (or zero counter).
For pBPA, there is only one terminated configuration e, and the two notions of termination coincide. For pPDA and pOC, selective termination intuitively corresponds to terminating with one of the distinguished output values.
The main algorithmic problem related to reachability is computing the probability £P(Reach(s,T)) for given s and T. For finite-state Markov chains with rational transition probabilities, £P(Reach(s,T)) is always rational and can be computed in polynomial time by solving a certain system of linear equations2 (see, e.g., [5]). For infinite-state Markov chains generated by pPDA, we only consider regular sets T of target configurations encoded by the associated DFA (see Definition 5). Still, £P(Reach(s,T)) can be irrational and cannot be computed precisely in general. Hence, in this setting we refine the task of computing £P(Reach(s, T)) into the following problems:
- Qualitative reachability/termination. Given s and T, do we have that ,0>{Reach{s,T)) = 1?
- Quantitative reachability/termination. Given s, T, and a rational constant p G (0,1), do we have that 0?(Reaches, T)) < pi
- Approximating the probability of reachability/termination. Given s, T, and a rational constant 8 > 0, compute £P(Reach(s, T)) up to the absolute/relative error 8.
Expected termination time and total accumulated reward. Similarly as in the previous paragraph, let us fix an initial state s G S and a set of target vertices T C.S. Further, we fix a reward function f : S —>
We are interested in the expected total reward accumulated along a path from s to T. Formally, for every run w, let Hit(w) be the least j G such that w(j) G T. We define a random variable Acc : Run(s) —>• where
Hit(w)-\
Acc{w) = £ /(w(/))
i=0
2 For every t G S, we fix a fresh variable Y,. If t G T, we put Y, = 1. If t cannot reach T at all, we put Yt = 0. Otherwise, we put Yt = E ^ , x-Yti. The resulting system of linear equations has only one solution
inRlsl which is the tuple of all &>(Reach(t,T)).
10
Tomáš Brázdil et al.
Note that if Hit(w) — 0, then the above sum is empty and hence Acc(w) — 0.
The task is to compute the expected value K(Acc). If 0 < £P(Reach(s, T)) < 1, we are also interested in the conditional expectation K(Acc \ Reach(s, T)). Note that if f(r) — 1 for all r e S, then K(Acc) corresponds to the expected number of transitions (or "time") needed to visit T from s.
For finite-state Markov chains, both K(Acc) and K(Acc \ Reach(s,T)) are easily computable in polynomial time. For pPDA and its subclasses, we face the same difficulties as in the case of reachability/termination, and also some new ones. In particular, K(Acc) can be infinite even if £P(Reach(s, T)) — 1 and / is bounded, which cannot happen for finite-state Markov chains. We also need to restrict the reward functions to some workable subclass. In particular, we consider reward functions that are
- constant, which suffices for modelling the discrete time;
- simple, i.e., for every configuration pXa with non-empty stack, f(pXa) depends just on the head pX (there are no restrictions on f(pe)). Simple reward functions are sufficient for modelling the costs and payoffs determined just by the currently running procedure;
- linear, i.e., for every configuration pa we have that
Here c,d are functions that assign a fixed non-negative value to every control state, and h does the same for stack symbols. Further, #x(a) denotes the number of occurrences of X in a. Note that by putting c(p) — 0, we can assign zero reward to all configurations of the form pa.
Linear reward functions are useful for, e.g., analyzing the stack height or the total amount of memory allocated by all procedures currently stored in the stack of activation records.
Similarly to reachability/termination, we refine the problem of computing K(Acc) and K(Acc \ Reach(s, T)) for pPDA and its subclasses into several questions.
- Finiteness. Given s, T, and /, do we have K(Acc) < °°? Do we have E(Acc\Reach(s,T)) < °°?
- Boundedness. Given s, T, f, and c e M+, do we have K(Acc) < cl Do we have E(Acc\Reach(s,T)) < c?
- Approximating the expected accumulated reward. Given s, T, f, and 8 > 0, compute K(Acc) and K(Acc \ Reach(s, T)) up to the absolute/relative error 8.
Apart from these algorithmic problems, we are also interested in the distribution of Acc, particularly in the special case when Acc models the termination time. Then £P(Acc < c) is the probability that the initial configuration terminates after at most c transitions, and we ask how quickly this probability approaches the probability of termination as c increases. Thus, we obtain rather generic results about the asymptotic behaviour of recursive probabilistic systems (see Theorem 11).
Probabilistic Pushdown Automata
11
Mean Payoff and Other Long-Run Average Properties. The mean payoff is the long-run average of reward per visited vertex. Formally, we fix some reward function / : S —>• M-°, and define random variables MPsup,MPinf: Run(s) —>• as follows:
MPsup(w) MPinf(w)
For finite-state Markov chains, we have that £P (MPsup ^ MPinf) — 0, although there may be uncountably many w g Run(s) such that MPsup(w) ^ MPinf (w). Further, both MP sup and MPinf assume only finitely many values v\,..., v„ with positive probabilities pi,...,pn, and these values and probabilities are computable in polynomial time by analyzing the bottom strongly connected components of the finite-state Markov chain. Hence, the expected values of MPsup and MPinf are the same and efficiently computable.
For pPDA and its subclasses, we consider the same classes of reward functions as in the case of total accumulated reward. First, we need to answer the question whether the mean payoff is well-defined for almost all runs, i.e., whether 3?(MPsup ^ MPinf) — 0, and what is the distribution of MPsup and MPinf. The algorithmic problems concern mainly approximating the expected value of MPsup and MPinf, and approximating the distribution of these variables.
Model-Checking Linear-Time and Branching-Time Logics. Let s be a vertex of M. A linear-time specification is a property
2 | ^*~p
i^q>2
Here a ranges over a countably infinite set Ap of atomic propositions, p g [0,1] is a rational constant, and ^g {<,<,>,>,=}■
limsup^LoZWO)
n^oo n + 1
— liminf
1
12
Tomáš Brázdil et al.
The logic PCTL is a fragment of PCTL* where path formulae are given by the equation
| i^ 2- The qualitative fragments of PCTL and PCTL*, denoted by qPCTL and qPCTL*, resp., are obtained by restricting the allowed operator/number combinations in £P~P(p subformulae to '=0' and 1' (we do not include '<1', '>0' because these are definable from '=0', '=1', and negation). Finally, a path formula
i A 2 iff s K #1 and í p ^2,
s K ^~p<ř> iff ^*({we.Rw«0) 1 w|=vl^<ř>2 iff there is j>0 s.t. Wj |=v 92 and w, ^v 0. We start with general pPDA and then show that some of the considered questions are solvable more efficiently for the pBPA and pOC subclasses.
3.1 Quantitative and qualitative reachability
Recall that the quantitative/qualitative reachability problems are the questions whether £P(Reach(s, T)) is bounded by a given rational p e (0,1) or equal to one, respectively.
Results for pPDA. First, we show how to solve the reachability problems for a simple set of target configurations. A generalization to arbitrary regular sets is then obtained by employing the generic construction recalled in Remark 1.
Let A — (Q,r, <->• ,Prob) be a pPDA and ^CgxTa simple set of configurations where is the associated set of heads (see Definition 5). For all p,q e Q and XeT, let
Probabilistic Pushdown Automata
13
- [pX»] denote the probability £P(Reach(pX,^));
- [pXq] denote the probability of all w g Reach(pX,{qe}) such that w does not visit a configuration of %'.
One can easily verify that all such \pX»] and [pXq] must satisfy the following:
- For all pX g and q g Q, we have that \pXm] = 1 and [pXq] — 0.
- For all pX ^ Jtf? and q g Q, we have that
[pX.] = £ x.[ry.] + £ x-[rY*] + £ £ x• [rYt] ■ [tZ.]
pX^rY pX^rYZ PX^rYZt£Q
\pXq]= £ * + £ x-[ry9] + £ £x.[r7f].[fZ?]
pX^tqE pX^trY pX^trYZ tf=Q
Now consider a system Reach^ of recursive equations obtained by constructing the above equalities for all [pX»] and [pXq] where p,q g Q and XeT, and replacing each occurrence of all \pX»] and [pXq] with the corresponding fresh variables (pX») and (pXq). In general, Reach^ may have several solutions. It has been observed in [25,31] that the tuple of all [pX»] and [pXq] is exactly the least solution of Reach^ in ([0,1]*, C), where k — |F| • (|g|2 + \Q\) and C is the component-wise ordering. Observe that if = 0, then [pX»] = 0 and [pX#] = ^(Reach(pX, {qe})) for all p,q e Q and X g F. Hence, in the case of termination, it suffices to put %' — 0 and consider a simpler system of equations Term& which is the same as Reach& but all (pXu) variables and the corresponding equations are eliminated.
Example 2 Consider again the pBPA A of Example 1, and let %' — 0. The system Term& looks as follows:
=£ + \-{A)-{I) (A) = 1.(1).(I)
Hence, [/] is the least solution of = ! + -^(I)3, which is (the golden ratio).
In general, the least solution of Reach^ cannot be given as a tuple of closed-form expressions. Still, we can decide if [pX»] < p for a given rational p g (0,1) by encoding this question in Tarski algebra (see Section 2.2). More precisely, we construct a formula
then also the least solution does. Since the formula
p in polynomial space by constructing a universal formula
"for every V g [0,1]* such that V is a solution of Reach a we have \{pXm) > P'■
The same observations are valid also for the probability [pXg], and be can be further extended to regular sets of target configurations (see Remark 1). Thus, we obtain the following:
14
Tomáš Brázdil et al.
Theorem 1 (see [25,31]) Let A = (Q,r,^,Prob) be a pPDA, pa e Q x F* a configuration of A, and c€ a regular set of configurations of A represented by a DFA £$'. Further, let p g (0,1) be a rational constant. Then the problems whether ^(Reach(pa,^)) = 1 and ^(Reach(pa,^)) < p are in PSPACE.
Interestingly, there are no lower complexity bounds known for the problems considered in Theorem 1. On the other hand, there is some indication that improving the presented PSPACE upper bound to some natural Boolean complexity class (e.g., the polynomial hierarchy) might be difficult. As observed in [31], even the problem whether £?>(Reach(pX, {qe})) — 1 is at least as hard as Square-Root-Sum, whose exact complexity is a long-standing open problem in computational geometry3.
As we shall see in Sections 4, 5, and 6, the tuple of termination probabilities, i.e., the least solution of Term^, is useful for computing and approximating other interesting quantities. Since Term^ can have several solutions, it is not immediately clear that the tuple of all termination probabilities [pXq] is efficiently expressible in the existential fragment of Tarski algebra in the sense of Definition 3. However, we can effectively extend the system Term a by the following constraints:
(1) 0 < (pXq) < 1 for every (pXq);
(2) (pXq) — 0 for every (pXq) such that [pXq] — 0;
(3) £q€Q (pXq) = 1 for every pXeQxT such that £?eG [pXq] = 1;
(4) LqeQ (pXq) < 1 for every pXeQxT such that ^qeQ\pXq} < 1.
Note that the constraints (1) and (2) can be computed in time polynomial in \\A ||, and the constraints (3) and (4) can be computed in space polynomial in \\A || by Theorem 1. It has been shown in [29,30] that the system Term^ extended with the above constraints has a unique solution in Rk, where k — \Q\2 ■ |F|. Thus, we obtain the following:
Theorem 2 Let A — (Q,T, ,Prob) be a pPDA. The tuple of all termination probabilities [pXq] is expressible in the existential fragment of Tarski algebra. Moreover, the corresponding formula is constructible in space polynomial in \\A\\, and the length of is polynomial in ||41|.
Results for pBPA. For pBPA, the quantitative termination, i.e., the question whether ,^(Reach(X,{e})) < p for a given rational p e (0,1), is still as hard as square-Root-Sum [31]. However, it has been also shown in [31] that the qualitative termination, i.e., the question whether £P(Reach(X,{e})) — 1, is solvable in polynomial time. This is achieved by employing some results and tools of spectral theory. First, let us consider a pBPA A — (F, ^ ,Prob) such that
(1) for every XeTwe have that 3*>(Reach(X, {e})) > 0;
(2) for all X, Y e F there is a configuration of the form Ya reachable from X.
3 An instance of SQUARE-ROOT-SUM is a tuple of positive integers ai,...,an,b, and the question is whether £"=1 < b. The problem is obviously in PSPACE, because it can be encoded in the existential fragment of Tarski algebra (see Section 2.2), and the best upper bound currently known is CH (counting hierarchy; see Corollary 1.4 in [1]). It is not known whether this bound can be further lowered to some natural Boolean subclass of PSPACE, and a progress in answering this question might lead to breakthrough results in complexity theory.
Probabilistic Pushdown Automata
15
As a running example, we use a pBPA © with three stack symbols X, Y, Z, where
x 025^ yz Y xx z 025^ yx
X°^e, Y Z, Z°^XZ,
Z^e
Assumptions (1) and (2) imply that either all stack symbols of A terminate with probability one, or all of them terminate with a positive probability strictly less than one. Hence, we need to decide whether the least solution of Term a is equal to (1,..., 1) or strictly less than 1 in every component. To get some intuition, let us first realize that A can also be interpreted as a multi-type branching process (MTBP). Intuitively, the symbols of F then correspond to different "species" which can evolve into fi-
0 25
nite collections of other species or die. For example, X^^YZ means "one copy of X can evolve into one copy Y and one copy of Z with probability 0.25", while X^>e means "X dies with probability 0.75". The states of the MTBP determined by A are finite collections of species (stack symbols) where the ordering of symbols is irrelevant. Each occurrence of every stack symbol in the current state chooses a rule independently of the others, and the chosen rules are then executed simultaneously. The probability of this transition is obtained by multiplying the probabilities of the chosen rules. Thus, we obtain an infinite-state Markov chain Ba whose states are tuples of the form (X*1,... ,X*"), where {Xj,... ,Xn} — F and g Z-° for every 1 < i < n, and the transitions are defined in the way described above. For example, the set of transitions in B@ includes the following (for simplicity, in each tuple we omit all symbols with zero occurrence index):
(X,Y)°^(Z), (X,Y)°^ (X2,Y,Z), (X2)°^(Y,Z)
The first transition is determined by the rules X ^4 e and Y -^>Z, the second transition by the rules X^^tYZ and Y-^>XX, and the third transition by the rules X ^ YZ and X ^ e. Note that the probability of the last transition is 2 • 0.25 • 0.75 = 0.375 because both copies of X select their rules independently.
Intuitively, the only difference between Ma and Ba is that in Ma , the stack symbols are processed from left to right, while in Ba, all stack symbols are processed simultaneously. However, if the goal is to empty the stack, it does not really matter in what order we process the symbols because all of them have to disappear anyway. Formally, one can prove that for every a e F*, the probability of reaching e from a in Ma is the same as the probability of reaching the empty family from the family of symbols listed in a in Ba ■ In particular, we have that every symbol X e F terminates with probability 1 in Ma iff the family (Xj,... ,X„) reaches the empty family with probability 1 in Ba - Now consider an n x n matrix C where C(i,j) is the expected number of X/s obtained by performing a rule of X,. For example, for the pBPA © we thus obtain the matrix
l
4
0
1
4
Here the symbols X, Y, Z are formally treated as X\, X2, X3, respectively. Note that, e.g., C(3,1) — + l + j. The three summands correspond to the three
16
Tomáš Brázdil et al.
outgoing rules of Z, where for each rule we count the number of X's on its right-hand side.
It follows immediately that the z'-th component of the vector (1,..., 1) C is the expected number of occurrences of X,- in a one-step successor of (Xj,... ,X„) in . In general, one can easily verify that the z'-th component of (1,..., 1) • Ck is the expected number of occurrences of X, in a £-step successor of (Xj,... ,X„). Intuitively, it is not surprising that (Xj,... ,X„) reaches the empty family with probability one iff the expected size of the family reached in k steps stays bounded as k increases. Indeed, denoting by Sk the size of the family in the k-Xh step, the Markov inequality implies that liminf^ocSi is almost surely finite. However, this means that almost every run must visit a particular family infinitely many times. As every symbol terminates with positive probability, almost all runs terminate.
Checking whether the expected size of the family stays bounded translates to checking whether the sum of all elements in Ck is bounded for all k > 1. Note that due to Perron-Frobenius theorem, the matrix C possesses a positive real dominant eigenvalue, say r (i.e., all other eigenvalues A satisfy | A | < r). The elements of Ck are bounded iff r < 1, and the latter condition can be checked in polynomial time [32].
For the pBPA ©, the largest eigenvalue of C is strictly less than 0.9, hence (X,Y,Z) reaches the empty family with probability one, and hence all of the symbols X, Y, Z terminate with probability one in M@.
If a given pBPA A does not satisfy the assumptions (1) and (2), we first determine and eliminate all stack symbols that cannot reach e (this is easily achievable in polynomial time). Then, the variables of Term& are ordered according to the associated dependency relation, and the resulting dependency graph is split into strongly connected components that are processed in the bottom-up direction, using the above described method as a procedure for resolving the most complicated subcase in the underlying case analysis. Thus, every stack symbol is eventually classified as terminating with probability 0, 1, or strictly in between. We refer to [32] for details.
The result about termination has been extended to general qualitative reachability in [13], even for a more general model of Markov decision processes generated by BPA. Thus, we obtain the following:
Theorem 3 Let A — (F, ^ ,Prob) be a pBPA, a e F* a configuration of A, and %' a regular set of configurations of A represented by a DFA si'. The problem whether S^(Reach(a,c€)) = 1 is in P.
Results for pOC. Currently known results about the quantitative reachability/termination for pOC are essentially the same as for pPDA. However, the qualitative termination, i.e., the question whether £P(Reach(p(k),T)) — 1, where T is a subset of configurations with zero counter, is solvable in polynomial time. The underlying method is based on analyzing the trend, i.e., the long-run average change in counter value per transition. This is an important proof concept which turned out to be useful also in the more general setting of MDPs and stochastic games over one-counter automata (see, e.g., [11,9,10]). Therefore, we explain the main idea in greater detail.
Probabilistic Pushdown Automata
17
I.+l
, 5,0
t$ *■» Q !£$ *.° £
1 2 0 ^,0 ■ 2 ' 1 2
Q c2
Fig. 2 A chain F4 and its bottom strongly connected components.
Let A — (Q,8 ,8 ,P ,P ) be a zero-trivial pOC. We may safely assume that for all p,q E Q there is at most one rule (p,c, q) E 8>0. The behaviour of A for positive counter values is then fully captured by the associated finite-state Markov chain Fa , where the set of vertices is Q and each transition is assigned, in addition to its probability, a weight, which encodes the corresponding change in the counter value. More precisely, p q is a transition of with probability x and weight c iff (p,c,q) G c)>0 and P>0(p,c,q) — x. An example of F^ is shown in Fig. 2 (note that the underlying pOC A has ten control states).
Now consider an initial configuration p(k) of A, and let T be the set of all configurations with zero counter (the selective case, when T Cgx {0}, is discussed later). Our aim is to decide whether £P(Reach(p(k), T)) — 1. Obviously, almost every run w G Run(p(k)) which does not visit T must visit a bottom strongly connected component (BSCC) C of Fa - For each such C we can easily compute the trend tc which corresponds to the long-run average change in the counter value per transition (in other words, tc is the mean payoff determined by transition weights). More precisely, for every q EC we first compute
change — ) x-c
which is the expected change of counter value caused by an outgoing transition of q. Then, we take the weighted sum of all changeq according to the unique invariant distribution Tic for C (intuitively, Ttc{q) gives the "frequency" of visits to q along a run initiated in (some) state of C; see, e.g., [21]). Hence,
tc = Xc(q)-changeq.
q£C
For the BSCCs C\, C2, C3, and C4 of Fig. 2 we obtain the trends 0, 0, \, and -\, respectively (note that the invariant distribution is uniform for each of these BSCCs). Now we distinguish three possibilities:
- If tc < 0, then for every configuration q(£) where q EC we have that q(£) terminates with probability 1. Intuitively, this is because the counter tends to decrease on average, and hence it is eventually decreased to zero with probability 1.
18
Tomáš Brázdil et al.
- If tc > 0, one might be tempted to conclude that £P(Reach(q(£),T)) < 1 for every q E C and £ > 1. The intuition is basically correct, but for some small £, the configuration q(£) may still terminate with probability one, because the initial transitions of q(£) may only decrease the counter even if the overall trend is positive. For example, consider the configuration g( 1) in the underlying pOC of Fig. 2. Although g G C3 and the trend of C3 is positive, g(l) terminates with probability one. In general, one can show that if q(£) can reach a configuration with an arbitrarily high counter value without a prior visit to a configuration with zero counter, then q(£) terminates with probability strictly less than one; otherwise, q(£) terminates with probability one. This condition can be checked in polynomial time by the standard reachability analysis (see, e.g., [24]), and for every q EC we can easily compute a bound k G N such that £P(Reach(q(£) ,T)) < 1 iff £ > k. For example, for the state g of Fig. 2 we have that k — 2.
- If tc — 0, then for every q(£) where q EC we have that q(£) terminates with probability one iff q(£) can reach a configuration with zero counter. Again, this condition is easy to check in polynomial time, and for each q we can easily compute a bound k E such that &'(Reach(q(£),T)) — 1 iff £ < k. For example, for the states c and e of Fig. 2 we have that the k is equal to 1 and °°, respectively. Intuitively, the above condition captures the difference between two possible "types" of BSCCs with zero trend, which can be informally described as follows:
- In Type I case, the counter can be changed by an unbounded amount along a run in C (a concrete example is the component C2 of Fig. 2). Then, given qEC, the expected accumulated counter change between two visits of q in F^ is zero. At the same time, the accumulated change is negative with some positive probability. Thus, by standard results of theory of random walks (see, e.g., [21]), for every run w of F^ we have that the counter change accumulated along w fluctuates among arbitrarily large positive and negative values. However, then the corresponding run of initiated in a configuration q(£) eventually terminates.
- In Type II case, the counter change along every run in C is bounded. A concrete example is the component C\ of Fig. 2, where the counter is not changed at all. Then, a run initiated in q(£) terminates either with probability one or zero, depending on whether £ is small enough or not, respectively.
Using the above observations, we can determine if a configuration q(£), where q belongs to some BSCC of F^, terminates with probability one or not. If q does not belong to a BSCC of F^, we simply check whether q(£) can reach a configuration q'(£'), such that q' belongs to some BSCC and q'(£') terminates with probability less than one. If so, then q(£) terminates with probability less than one, otherwise it terminates with probability one.
For the selective termination, when T C Q x {0}, we have that ,^>(Reach(q(£),T)) = 1 iff ,^>(Reach(q(£),Q x {0})) = 1 and q(£) cannot reach a configuration r(0) ^ T. Thus, we obtain the following:
Theorem 4 Let A = (Q, 8=°, 8>Q,P=Q,P>0) be a zero-trivial pOC, q(£) a configuration of A, and T C Qx{0} a set of target configurations. The problem whether &'(Reach(q(£), T)) — 1 is in P, assuming that £ is encoded in unary.
Probabilistic Pushdown Automata
19
3.2 Approximation results for reachability and termination
Now we consider the problem of approximating £P(Reach(s,T)) up to the given absolute/relative error 8 > 0.
Note that the results of Theorem 1 can be used to compute £P(Reach(pa,^)) up to the given absolute error 8 > 0 in polynomial space by a simple binary search. However, since this algorithm uses a decision procedure for the existential fragment of Tarski algebra, it is not really practical. Observe that we can view the system Reach^ introduced in Section 3.1 more abstractly as a system of polynomial equations of the form yi — Pok(y\,y„), where n e N, 1 < i 0 a rational
20
Tomáš Brázdil et al.
constant represented as a fraction of binary numbers. Then there isreQ computable in polynomial time such that \ &\Reach(a ,c^)) — r\<8.
For pOC, it is not known whether the termination probabilities can be approximated in polynomial time on the standard Turing machine model. So, we can only state a somewhat weaker result.
Theorem 6 Let A — (Q,8=0,8>0,P=0,P>0) be a pOC, p(k) a configuration of A (where k is encoded in unary), q G Q, and 8 > 0 a rational constant. Then there is r G Q computable in polynomial time on the unit-cost rational arithmetic RAM such that\0>(Reach(p(k),{q(O)}))-r\ < 8.
Approximating £P(Reach(pa,^)) up to a given relative error 8 > 0 is more problematic. It requires exponential time even for pBPA and termination, because the probability £P(Reach(X, {e})) can be doubly-exponentially small in the size of the underlying pBPA A. To see this, realize that pBPA can simulate repeated squaring; let A — ({Xi,...,Xn+i, Z}, <->•, Prob) where, for all 1 0 in polynomial time on the unit-cost rational arithmetic RAM.
4 Translating pPDA into pBPA
In this section we present the construction of [17] which transforms every pPDA into an equivalent pBPA where all stack symbols terminate either with probability 0 or 1. This transformation preserves virtually all interesting quantitative properties of the original pPDA (except, of course, termination probabilities) and it is in some sense effective. Thus, the study of general pPDA can be reduced to the study of a special type of pBPA, and the reduction step does not lead to any substantial increase in complexity (at least, for the problems considered in this survey).
For the rest of this section, we fix a pPDA A — (Q,T, ^ ,Prob). For all p,qeQ and X G F, we use
- Run(pXq) to denote the set of all runs in initiated in pX that visit qe;
- Run(pX^) to denote the set of all runs in initiated in pX that do not visit a configuration with empty stack.
The probability of Run(pXq) and Run(pX^) is denoted by [pXq] and [pX^], respectively.
The idea behind the transformation of A into an equivalent pBPA A. is relatively simple and closely resembles the standard method for transforming a PDA into an equivalent context-free grammar (see, e.g., [36]). Formally, the stack alphabet F. of A. is defined as follows:
Probabilistic Pushdown Automata
21
- For all p G Q and X G F such that [pXt] > 0 we add a stack symbol (pXt) to F..
- For all p,q e Q and X G F such that [pXg] > 0 we add a stack symbol (pXq)
tor..
Note that F. is effectively constructive in polynomial space by applying the results of Section 3. Now we construct the rules of A. together with their probabilities. For all (pXq) G F. we do the following:
- if pX ^> rYZ, then for all s G Q such that y — x ■ [rYs] ■ [sZq] > 0 we put (pXq) (rYs)(sZq);
- if pX ^> rY where y — x- [rYq] > 0, we put (pXq) q\. (rYq);
- if pX^qe, we put (pXq) 0 we put
(PXt)^.(rYs}(sZty,
- for all q G Q and Y G F such that y = 1] ■ ^pX^qYpz > 0 we Put
{PXt)^%.{qYt). Note that the transition probabilities of A. may take irrational values, but are effectively expressible in the existential fragment of Tarski algebra (see Theorem 2). Obviously, all symbols of the form (pX"l) terminate with probability 0, and we show that all symbols of the form (pXq) terminate with probability 1 (see Theorem 7).
Remark 2 The translation from A to A. makes also good sense when A is a pBPA. Since qualitative termination for pBPA is decidable in polynomial time (see Theorem 3), one can also efficiently compute the set of all stack symbols Y of A such that [71] > 0, and hence the set of rules of A. is constructible in polynomial time (the rule probabilities may still take irrational values). Consequently, some interesting qualitative properties of pBPA are decidable in polynomial time, because they do not depend on the exact values of rule probabilities in A. (see Section 7.1).
Example 3 Consider a pPDA A with two control states p, q, one stack symbol X, and the following transition rules, where a > 1/2:
pX A 1/2, the resulting pBPA has a tendency to decrease the stack height. Hence, both (pXq) and (qXp) terminate with probability 1.
22
Tomáš Brázdil et al.
Every run of Ma initiated in pX that reaches qe can be "mimicked" by the associated run of Ma. initiated in (pXq). Similarly, almost every4 run of Ma initiated in pX that does not visit a configuration with empty stack corresponds to some run of MA. initiated in (pXt).
Example 4 Let A be a pPDA with two control states p,q, one stack symbol X, and the following transition rules:
pX'^pXX, pX'^qE, qX^qE.
Then [pXq] — 1 and [qXq] — 1, which means that A. has just two stack symbols (pXq) and (qXq) and the rules
(pXq) ^4. (pXq) (qXq), (pXq) £, (qXq) ^. E.
The infinite run pX, pXX, pXXX,... cannot be mimicked in Ma., but since the total probability of all infinite runs initiated in pX is zero, almost all (but not all) of them can be mimicked in Ma..
The correspondence between the runs of Ma and Ma. is formally captured by a finite family of functions (-)q where 0 G QU {"[}. For every run w G Run(pX) in Ma, the function (-)q returns an infinite sequence wq such that wq(i) G F.* U {x} for every i G Z-°. The sequence wq is either a run of Ma. initiated in (pXQ), or an invalid sequence. As we shall see, all invalid sequences have an infinite suffix of " x " symbols and correspond to those runs of Run(pX) that cannot be mimicked by a run of Run((pXQ}).
So, let 0 e Q U {t}, and let w be a run of Ma initiated in pX. We define an infinite sequence w0 over r* U {x } inductively as follows:
- w0(O) is either (pXQ) or x, depending on whether (pXQ) G F. or not, respectively.
- If wq(i) — x or wq(i) — £, then wo(i+l) — wq(i). Otherwise, we have that wQ(i) — (pX-f)a, where f G Q U {t}, and w(i) — pXj for some y G F*. Let pX^r/5 be the rule of A used to derive the transition w(i) —>-w(?+l). We put
a if j8 — E and f — r;
(r7f )a if j8 = 7 and [r7f ] > 0;
(r7s) (iZf) a if j8 = 7Z, [^Zf ] > 0, and there is k > i such that
w0(/+l) — ^ w(k) — sZymd \w(j)\ > \w(i)\ for all i< j < k;
(r7t)a if )3 = 7Z, f = t Kt] > 0, and |w(;')| > \w(i)\
for all j > /;
x otherwise.
4 Here "almost every" is meant in the usual probabilistic sense, i.e., the probability of the remaining runs is zero.
Probabilistic Pushdown Automata
23
We say that w G Run(pX) is invalid if w©(/) — x for some i G Z-°. Otherwise, w is vaZ/ii. It is easy to check that if w is valid, then w© G Run((pX(•)))■ Hence, (■)© can be seen as a partial function from Run(pX) to Run((pXQ)) which is defined only for valid runs. Further, for every valid w G Run(pX) and every i G Z-° we have that
- w(i) — rYfi iff w©(/) = (rKf )yfor some f G gU {t} and 7G F.*,
- w(i) — re iff Wq(i) — e and 0 — r.
Hence, (•)© preserves all properties of runs that depend just on the heads of visited configurations. Further, (■)© preserves the probability of measurable subsets of Run(pX) with respect to the associated probability measure £PQ. More precisely, we define the probability space (Run(pX),&, £Pq), where & is the standard Borel a-field generated by all basic cylinders (see Section 1) and &q is the unique probability function such that for every w G FPath(pX) we have that
_ ^>(Run(w)nRun(pXQ))
where & is the standard probability function. Now we can state the main theorem, which says that (■)© is a probability preserving measurable function.
Theorem 7 (see [17]) Let A = (Q,r, ^,Prob) be a pPDA, p G Q, X G r, and 0 G r U {t} such that \pX<3] > 0. Then for every measurable subset R C Run((pXQ}) we have that (O©1^) — Run{pX) is measurable and &{R) — ^©((O©1 (R))-
In particular, Theorem 7 implies that all symbols of the form (pXq) which belong to r. terminate with probability one, because
&>(Reach((pXq),{e})) = &>g((-)-l(Reach((pXq),{e})) = &>g(Run(pXq)) = 1. 5 Termination Time and Total Accumulated Reward
Now we show how to compute the (conditional) expected total reward accumulated along a run initiated in a given configuration before visiting a target configuration from a given regular set. We also formulate generic tail bounds on the termination time in pPDA. This section is based mainly on the results presented in [26,17,15].
5.1 Computing and Approximating the Expected Total Accumulated Reward
Results for pPDA and pBPA. Let us fix a pPDA A — (Q,T, ^,Prob) and a simple reward function / : Q x F* —>• R-° (recall that a reward function is simple if f(pXa) — f(pXp) for all pX G Q x r and all a, j8 G r*). As in the previous sections, for all p, q G Q and X G F we use Run(pXq) to denote the set Reach(pX, {ge}), and [pXq] to denote the probability of Run(pXq). If [pXq] > 0, then we also use Epxq to denote the conditional expectation
E(Acc I Run(pXq))
24
Tomáš Brázdil et al.
where Acc is the random variable introduced in Section 2.4 and the set of target configurations contains just qe. That is, Epxq is the conditional expected total reward accumulated along a run initiated in pX before visiting qe under the condition that qe is eventually visited, where / is the underlying reward function.
We show that the tuple of all Epxq, where [pXq] > 0, is the least solution of an effectively constructible system of linear equations ExpectA, where the (fractions of) termination probabilities are used as coefficients. Since Epxq can also be infinite, we need to consider the least solution of ExpectA in ((Mi0)*, C), where k is the number of all triples (p,X,q) such that [pXq] > 0, and C is the component-wise ordering.
The system ExpectA is obtained as follows. First, we compute the set of all triples (p,X,q) such that [pXq] > 0 (this can be done in polynomial time). For each such (p,X, q) we fix a fresh variable (Epxq) and construct the equation given below, where all summands with zero coefficients are immediately removed.
{EpXq) = f(PX)+ £ X-^.{ErYq)+ £ LX'[rSiZq]-((ErYl) + (ElZq)) (1)
Example 5 Let us consider a pBPA A with just one stack symbol / and the rules
/ A //, / ^ e
Further, let /(/) — 1. The system ExpectA then contains just the following equation:
<£,) = 1+*- [/]•((£/> + (£/»
If x — |, then [/] — j and hence Ej — 3. If x — \, then [/] — 1 and the only solution to the above equation is °°.
In general, ExpectA may have several solutions in ((R^0)*, C). However, if we identify and remove all variables which are equal to 0 or °° in the least solution, then the resulting system Expect'A has exactly one solution in ((M-0)*, C). To see this, let us assume that Expect'A has another solution v apart from the least solution jj. (note that v — fi > 0). Since we eliminated all zero variables, jj. is strictly positive in all components and hence there is c > 0 such that (c,...,c) C jj.. Let d be the maximal component of v — fi. Then d > 0, and jj. — ^ (v — fi) is also a non-negative solution of Expect'A which is strictly smaller than fi, and we have a contradiction.
Note that one can easily identify all (Epxq) such that Epxq — 0. This is because Epxq — 0 iff every w e Run(pXq) visits only configurations with zero reward (except for the last configuration qe), and this is a simple reachability question which can be solved in polynomial time by standards methods [24]. However, identifying the variables (Epxq) such that Epxq — °° is not so trivial because the coefficients of ExpectA are given only symbolically and their actual values are not at our disposal. Still, one can easily express the question whether Epxq — °° in the existential fragment of Tarski algebra, and hence the system Expect'A is constructible in polynomial space. This implies that the tuple of all Epxq is effectively expressible in the existential fragment of Tarski algebra. The corresponding formula (cf. Definition 3) can be constructed in polynomial space and its size is polynomial in the size of A.
Probabilistic Pushdown Automata
25
Remark 3 It is worth mentioning that in the special case when A is a pBPA such that all stack symbols terminate with probability one, the system Expec^ and its only solution are computable in polynomial time. Here, the only problem is to identify all stack symbols X such that Ex — °°, which can be done by constructing the dependency graph among the stack symbols (we say that X depends on Y if X can reach a configuration of the form Ya), identifying strongly connected components (SCCs) in this graph, and processing them in the bottom-up direction. Note that if Ey — °°, then Ex — °° for all X that depend on Y. For a bottom SCC C, we simply consider a pBPA Ac obtained from A by restricting the stack alphabet to C, and check whether Expect A(, has a solution in non-negative reals. If so, Ex < °° for all X EC, otherwise Ex — °° f°r all X EC. A similar procedure is used when processing the intermediate SCCs. So, we eventually decide whether Ex — °° for every X E T.
The above observations can be immediately extended to the conditional expectation of the form
where %' is a simple set of target configurations (see Definition 5). This is because we can modify a given pPDA A into another pPDA A' by
- adding a fresh control state t where tX ^> t£ for every stack symbol X of A;
- modifying the rules of A so that the only successor of every qY/5 E %' is the configuration tY/5;
- extending the reward function / by stipulating f(tX) — 0 for every stack symbol
It follows immediately that K(Acc | Reach(pX,^)) computed for A is equal to K(Acc | Reach(pX, {?£})) computed for A', and thus we can apply the results mentioned above. Further, we can generalize this observation to regular sets of target configurations by using the generic method of Remark 1. Thus, we obtain the following theorem:
Theorem 8 (see [26]) Let A — (Q,T, ^ ,Prob) be a pPDA, pa E QxT* a configuration of A, c 0, and f a simple reward function. Further, let p > 0 and 8 > 0 be rational constants. Then the problems whether E(Acc \ Reach(pa ,c/o)) < °° andK(Acc \ Reach(pa,c/o)) < p are inPSPACE. Further, there is r G Q computable in polynomial space such that \E(Acc \ Reach(pa,c/o)) — r \ < 8.
In the special case when A is a pBPA where all stack symbols terminate with probability one, the conditional expectation E(Acc \ Reach(a, {e})) is computable in polynomial time.
Theorem 8 can be easily extended to a more general class of linear reward functions. Recall that a reward function / is linear if there are functions c,d : K-° and h-.T ->• R-° such that for all pa G Q x r* we have that
E(Acc | Reach(pX
X of A.
26
Tomáš Brázdil et al.
where #x(oc) denotes the number of occurrences of X in a. Again, we start by considering the conditional expectation
E(Acc\Run(pXq)).
Intuitively, the main difference from the case when / was simple is that after executing a rule of the form pX qYZ, the symbol Z contributes to the total accumulated reward even if it is hidden in the middle of the stack. Using the results about simple reward functions, we can express the expected number of visits to a configuration with a given control state r along a path from pX to qe. Thus, we can also express the "expected contribution" of Z to the total reward accumulated along such a path. These considerations lead to a system of equations similar to ExpectA (we refer to [26] for details). Hence, Theorem 8 holds also for linear reward functions without any change.
Let us note that Theorem 8 can be generalized even further; it holds for an arbitrary (fixed) conditional moment
E(Accl \Reach{pa,c6J)).
In particular, one can approximate the conditional variance of Acc up to an arbitrarily small absolute error e > 0 in polynomial space [26].
Results for pOC. In this paragraph we present the results of [15] about the conditional expected termination time in zero-trivial pOC.
Let us fix a zero-trivial pOC A — (Q,8=Q ,8>Q ,P=Q ,P>0). Similarly as in Section 3.1, we assume that for all p,q G Q there is at most one rule (p,c,q) G 8>0. Since we are primarily interested in the (conditional) expected termination time, we fix a constant reward function / which returns 1 for every configuration. Consistently with the notation previously adopted for pPDA, we use Run(plq) to denote the set Reach(p(l),{q(0)}), and [plq] to denote the probability of Run(plq). For every i G N, we also use Run(plq, i) to denote the set of all runs initiated in p(l) that visit q(0) for the first time in exactly i transitions, and [plq, i] to denote the probability of Run(plq, i). If [plq] > 0, then Ep^q denotes the conditional expectation
E(Acc | Run(plq)).
where q(0) is the only target configuration. Also recall the finite-state Markov chain which captures the behaviour of A for positive counter values, and the definitions of the expected counter change at q G Q, denoted by changeq, and the trend of a given bottom strongly connected component (BSCC) C of denoted by tc (see Section 3.1). For every configuration p(k) of A, we use
- pre* (p(k)) to denote the set of all configurations that can reach p(k) in M^;
- post* (p(k)) to denote the set of all configurations reachable from p(k) in Ma ■
Our aim is to show that the problem whether Ep^q < °° is decidable in polynomial time, and that the value of Ep^q (if it is finite) can be efficiently approximated. A crucial step towards these results is the following theorem:
Probabilistic Pushdown Automata
27
Theorem 9 (see [15]) Let A = (Q, S=°, 8>0 ,P=0 ,P>0) be a zero-trivial pOC, and let p,q G Q such that [plq] > 0. Further, let xm,„ denote the smallest (positive) probability in FA.
(A) If q is not in a BSCC ofFA, then Ep^q < 5\Q\ / *J^J,+'G' •
(B) Let q G C, where C is a BSCC of FA. Further, let Conf — pre*(q(0)) H post*(p(l))nCxK Then
(a) if Conf is a finite set, then Ep^q < 20\Q\3/x^ ;
(b) if Conf is an infinite set, then
(1) ifC has trend t^ 0, then Eplq < 85000|e|6/(-*^l+lG'3 -t4);
(2) if C has trend t — 0, then Ep^q is infinite.
According to Theorem 9, the value of Ep^q is either infinite or at most exponential in the size of A. Note that this does not hold for pBPA, where the value of Ex can be doubly exponential in the size of the underlying pBPA (an example is easy to construct by simulating repeated squaring similarly as in Section 3.2).
It follows from the results of [24] that the sets pre* (q(0)) and post* (p(l)) are regular and the associated DFA are computable in polynomial time. Hence, the finiteness of Conf can be decided in polynomial time, and thus we obtain the following corollary to Theorem 9:
Corollary 1 Let p,q G Q such that [plq] > 0. The problem whether Ep^ < °° is in P.
The bounds given in Theorem 9 also provide the missing piece of knowledge needed for efficient approximation of the expected termination time in pOC. Recall that the tuple of all Ep^q, where [plq] > 0, is the least solution of the system of linear equations ExpectA. Due to Corollary 1, we can eliminate all variables (plq) such that Epiq — 0 or Ep^q — °° in polynomial time, and thus construct the system Expect'A. We have already shown that Expect1^ has only one solution. Also recall that the coefficients of Expect1^ are fractions of termination probabilities, which can be computed up to an arbitrarily small positive error in polynomial time, assuming the unit-cost rational arithmetic RAM model of computation (see Theorem 6). Using the bounds of Theorem 9, for each 8 > 0 we can give e > 0 (as a function of 8) such that solving a perturbed system Expect'A, where the coefficients are just approximated up to the absolute error e, produces a solution whose absolute error is bounded by 8. Further, the size of e (i.e., the length of the corresponding binary encoding) is polynomial in the size 8. Thus, we obtain the following:
Theorem 10 (see [15]) Let A = (Q, 8=0,8>Q,P=Q,P>0) be a zero-trivial pOC, and let p,q G Q such that [plq] > 0 and Ep^ < °°. Further, let 8 > 0 be a rational constant. Then the value of Ep^q can be approximated up to the absolute error 8 in polynomial time, assuming the unit-cost rational arithmetic RAM model of computation.
In the rest of this subsection we sketch the main ideas behind the proof of Theorem 9. In particular, we indicate why Ep^q is infinite only in case (B.b.2). First assume case (A), i.e., q is not in a BSCC of FA. Then for all s(£) Epost* (p(l)), where I > \Q\,
28
Tomáš Brázdil et al.
we have that s(£) can reach a configuration outsidepre*(q(0)) in at most \Q\ transitions. It follows that the probability of performing a path from p( 1) to q(0) of length i decays exponentially in i, and hence
Eplq — > Í
[pig, i] i * \pW[
is finite. As an example, consider the states a and b of Fig. 2. A "long" path from a(l) to b(0) inevitably loops between the control states a and b. Since there is always a chance to enter some BSCC of the probability of executing a path of length i which loops between a and b decays exponentially in i.
Next assume case (B.a), i.e., Cis a BSCC and Conf is a finite set. It is easy to show that the expected time for a run in Run(plq) to reach C is finite. Once the run has reached C, it basically moves within a Markov chain on Conf. By assumption, Conf is finite (which implies, by a pumping argument, that \Conf \ < 3|g|3). Consequently, after the run has reached C, it reaches q(0) in finite expected time.
Case (B.b) requires new non-trivial techniques. For the sake of simplicity, from now on we assume that Q — C (the general case requires only slight modifications of the arguments presented below). We employ a generic observation which connects the study of pOC to martingale theory (recall the definitions and results of Section 2.1). Let us fix an initial configuration r(c) egxN. Our aim is to construct a suitable martingale over Run(r(c)). Let p^ and cW be random variables which to every run w G Run(r(c)) assign the control state and the counter value of the configuration w(i), respectively. Note that if the expected change of counter value changes was the same for every s G C, we would have changes — t where t is the trend of C, and we could define a martingale m^, raW,... simply by
(i) _ jS -i-t if cW > 1 for all 0 < j < i; jra^1) otherwise.
Since changes is generally not constant, we might try to compensate the difference among the individual control states by adding a constant "weight" vs to each s G C. That is, we aim at designing a martingale of the form
(■) i L + v (j) — i-t if cW > 1 for all0 < j < i;
otherwise
In [15], it is shown that there indeed exists a vector of suitable vs G M such that the above stochastic process becomes a martingale. Further, the difference between the
maximal and the minimal weight assigned to some state, denoted by diff, is bounded
Icl
by 2|C|/xJni^. Due to this result, powerful tools of martingale theory, such as Azuma's inequality and the optional stopping theorem (see Section 2.1) become applicable to pOC. In particular, we can resolve both case (B.b. 1) and case (B.b.2) of Theorem 10.
Let us first consider case (B.b.l) when t ^ 0. By applying Azuma's inequality to the above martingale m^, raW,... over Run(p(\)), we show that there are 0 < a < 1 and h G N such that for all i > h we have that [plq, i] < a1. This immediately implies
Probabilistic Pushdown Automata
29
that Ep\^ is finite, and the bound given in case (B.b.l) is obtained by analyzing the size of a and h.
Realize that for every w G Rim(plq, i) we have that
(m«-m(°))(w) = vq-vp-i-t.
Hence, [piq,i] < 0P(m^ — — vq — vp — i - t). A simple computation reveals that for a sufficiently large h G N and all i > h we have the following:
- If t < 0, then
\piq,i] < & (m®-m<® > (i/2) • (-f)) .
- If t > 0, then
\piq,i] < & (m®-m<® < (i/2) • (-r)) .
In each step, the martingale value changes by at most diff+t+l, where diff is defined above. Hence, by applying Azuma's inequality, we obtain the following for all t ^ 0 and i > h:
( (i/2)2t2 \
[piq>i] - &xp{-2i(diff+t+ir) - a'
Here a — exp (—t2 / %(diff + t+ l)2) < 1. Hence, Ep^q < °°, and the bound given in Theorem 9, case (B.b.2), is computed by means of the bounds on diff and h.
Finally, consider case (B.b.2), i.e., t — 0. We need to show that Ep^q — °°. Since pre*(q(0)) n post* (p(\)) n C x N is infinite, for an arbitrarily large teN there is a configuration r(k) G pre* (q(0)) P\post* (p(l)). We show that if k is sufficiently large, then the expected number of transitions needed to decrease the counter by some fixed constant b is infinite. This is achieved by analyzing the martingale ra(°),raW,... for r(k), but this time we use the optional stopping theorem (see Section 2.1) to show that the probability of performing a finite path of length i which decreases the counter by b decays sufficiently slowly to make the expected length of this path infinite. It follows that Ep^, is also infinite. We refer to [15] for details.
5.2 Distribution of Termination Time
In this section we present the results of [17] about the distribution of termination time in pPDA. These results do not have immediate algorithmic consequences, but bring a general insight into the behaviour of probabilistic recursive programs. In particular, we show that stochastic computations defined by pPDA are "well-behaved" in the sense that if their expected termination time is finite, then the actual termination time is exponentially unlikely to deviate from this expectation (i.e., the probability of performing a run of length n decays exponentially in «).
Let us fix a pPDA A — (Q,r,^,Prob) and a constant reward function / : Q x r —>• {0,1} which assigns 1 to all configurations. Similarly as in Section 5.1, for all p,q G Q and X G F we use Run(pXq) to denote Reach(pX,{qe}), and [pXq] to denote the probability of Run(pXq). If [pXq] > 0, then Epxq denotes
30
Tomáš Brázdil et al.
K(Acc | Rim(pXq)) where qe is the only target configuration and / is the underlying reward function.
Our aim is to show that for all p,q E Q and X E T such that [pXq] > 0, there are essentially three possibilities:
- There is a "small" k EN such that £P(Acc > n \ Rim(pXq)) — 0 for all n>k.
- Epxq is finite and £P(Acc > n \ Rim(pXq)) decreases exponentially in n.
- Epxq is infinite and £P(Acc > n \ Rim(pXq)) decreases "polynomially" in n.
A precise formulation of this result is given below.
Due to the translation presented in Section 4, we can equivalently consider pBPA where each stack symbol terminates with probability 1 or 0. Let X,Y be stack symbols. We say that X depends on Y if X can reach a configuration with head Y. Since the symbols which terminate with probability 0 are not interesting from the current point of view, they can be safely removed (obviously, the symbols which terminate with probability 1 do not depend on symbols which terminate with probability 0). So, for the rest of this section we assume that A — (F, ^ ,Prob) is a pBPA where every X E F terminates with probability 1. For every a E F*, we use Ea to denote the expected value of Acc over Run(a) where the only target configuration is e (note that a terminates with probability 1). Observe that the dependence relation partitions F into "strongly connected components" formed by symbols that depend on each other, and one can also order these components into a directed acyclic graph (DAG) which has some finite height h > 1 (here the height of a DAG consisting of a single strongly connected component is defined as 1).
Theorem 11 (see [17,16]) Let A — (F, ^ ,Prob) be a pBPA where every X E F terminates with probability 1. Let xm,„ be the minimal probability assigned to a rule of A. Then for every Xq E F, one of the following is true (where Acc is interpreted as a random variable over Run(Xo)):
where d\ — 18/z|F|/x^', and di — \j (2h+1 — 2). Here, h is the height of the DAG of strongly connected components of the dependence relation, and c is a suitable positive constant depending on A.
One can effectively distinguish between the three cases set out in Theorem 11. More precisely, case (1) can be recognized in polynomial time by looking only at the structure of the pBPA, i.e., disregarding the probabilities. Determining whether Ex0 is finite or infinite can be done in polynomial space by Theorem 8. This holds even if
(1)
(2)
0>(Acc >2lrl) =0.
Ex0 is finite and for all n G N with n > 2Ex0 we have that
x"min < ^(Acc>n) < exp 1
(3)
Here, Emax — max{Ex \ X depends on Xq} < °°
Ex0 is infinite and there is no EN such that for all n>no we have that c/y/n < SP(Acc>n) < di/nd2
Probabilistic Pushdown Automata
31
the transition probabilities of A are represented just symbolically by formulae of the existential fragment of Tarski algebra.
The proof of Theorem 11 is based on designing suitable martingales that are used to analyze the concentration of the termination probability. Here we only sketch the proof for the upper bound of Theorem 11 (2), which is perhaps the most interesting part. Observe that for every a G F* such that a ^ e, Ea < °°, and a terminates with probability one, performing one transition from a decreases the expected termination time by one on average. For every w G Rim(Xo), we denote by I(w) the maximal number j > 0 such that w(j — 1) ^ £. For every i > 0, we put
m(,)(w) = +min{z',/(w)}
It is easy to see that E(m(!+1) | mW) — mW, i.e., ra(°),raW,... is a martingale. Let £mas — max{£'x | X depends on Xq}, and let n > 2Ex0 ■ By applying Azuma's inequality (see Section 2.1) we obtain
For every w G Run(Xo) we have that w(n) ^ £ implies raM > n. It follows:
0>(Acc>n) < &>(m(n)>n) < exp (2Ex°~) < expfl--
The proof of Theorem 11 (3) is also based on designing and analysing a suitable martingale, but the argument is more technical. We refer to [17] for details.
6 Mean Payoff and Other Long-Run Properties
In this section we indicate how to analyze long-run average properties, such as the expected mean payoff, for pPDA. The key idea, originally presented in [25,14], is to abstract the behaviour of a given pPDA A into a finite-state Markov chain X&, where possible evolutions of individual stack symbols are "summarized" and explicitly quantified. Since Xa preserves important behavioural aspects of A, the questions about long-run average properties of A can be reformulated as questions about X^ and solved by standard methods for finite-state Markov chains.
The original definition of X^ and the underlying analysis given in [25,14] are somewhat technical. In this section, we follow a simpler approach enabled by the translation presented in Section 4. That is, we introduce the chain X^ only for the special type of pBPA obtained by this translation, which is remarkably simple. Then, we show how X^ can be used to compute/approximate the expected mean payoff for this special type of pBPA. For completeness, we also show how to lift the obtained results to general pPDA. Let us note that the chain X^ is also used in Section 7.1 to solve the model-checking problem for pPDA and linear-time logics.
For the rest of this section, we fix a pBPA A — (F, ^ ,Prob), where F is partitioned into two disjoint subsets F| and Fj- of all symbols that terminate with probability 1 and 0, respectively (cf. Section 4). Obviously, if X^a and X G F|, then
32
Tomáš Brázdil et al.
0.7
Fig. 3 The chain XA for the pBPA of Example 6.
B 0.8 e,
B, B 0.1 BB,
AA, B 0.1 AA,
a G Fj*. Similarly, if x ^ a and x G Fj-, then a contains at least one symbol of Fj-, and we can safely assume that a G Fj*Fj-.
Example 6 Let 4 be a pBPA with stack symbols A, B, X, Y, z, where X^AY, Y^BX, Z^AX, a ^>
01 04 02 02
x ^> BYi Y ^AY, z ^4 B7, a ^4
x^f, y^4z, z^>az, a ^>
X^\ AZ, Y ^4 az, z ^> bz. ThenF^ = {a,b} andFf = {x,7,z}. Now we can define the promised Markov chain .
Definition 8 Let X4 be a finite-state Markov chain where Fj- is the set of vertices, and X A Y is a transition of Xa iff x — J\ y y > 0.
For the pBPA A of Example 6, the chain X4 is shown in Fig. 3.
Let Y G Fj-. Obviously, for almost every w G Run(M&,Y) there is an infinite increasing sequence of indexes z'o, i\such that w(i) G Fj- iff z — ij for some j G Z-°. The sequence w(z'o), w(/j),... is a run in X4 initiated in Y, which we call the footprint of w. Further, for every j G Z-°, the j'-f/zywmp of w is the finite path w(ij)... w(ij+\) in Ma . Hence, the transitions of x4 represent the jumps in . One can easily verify that the mapping Y which to every run w G Run(M&,Y) assigns its footprint w G Run(XA, Y) is defined almost surely and preserves probability.
Let C\,..., C„ be the bottom strongly connected components (BSCCs) of X4. For every 1 (ReachxA(Y,Ci)).
Note that £"=1Pi — 1- Further, let %i be the unique invariant distribution for Q (see, e.g., [21]). Then for almost all w G ReachxA (Y, Q) and every z G Q we have that
lim^M = *(Z)
Probabilistic Pushdown Automata
33
where #lz(w) denotes the number of all indexes j such that 0 < j < i and w(j) — Z. Since the above defined mapping T preserves probability, we obtain that almost every run w G Run(MA, Y) whose footprint w belongs to ReachxA (Y, Q) satisfies
lim z) ; = lim-^^- = Ki{Z) Wr (w) i + 1
where #lz(w) and #^ (w) denote the number all indexes 0 < j < i such that w(j) — Z
and w(j) G I}, respectively.
Now let / : F* —>• R-° be a simple reward function (see Section 2.4). Recall that for every A G we use £^ to denote E(Acc | ReacliMA (A, e)), i.e., the expected total reward accumulated along a terminating path initiated in A before visiting e. Let Z U be a transition of XA, and let E (Z —>• [/) be the expected total reward accumulated along a jump5 represented by Z^U. Observe that
E(Z^U) = /(Z) + £ (2)
Z^tAU
Let C, be a BSCC of XA with the invariant distribution TC{. Then
£ %iZ) -^yEiZ^U) = £ ^-(Z) • /(Z) + £ x£A (3)
is the long-run average reward per jump in a run of Run(MA,Z) whose footprint visits C{. More precisely, for almost all w G Run(MA, Y) whose footprint visits C,- we have that the long-run average reward per jump is defined (i.e., the limit of partial averages computed for more and more jumps in w exists) and is equal to (3).
Similarly, we obtain that the long-run average length of a jump exists for almost all w G Run(MA, Y) whose footprint visits C, and is equal to the same value (observe that the long-run average length of a jump is actually the long-run average reward per jump with respect to the constant reward function 1 which assigns 1 to every configuration; hence, we can just re-use (3) for another reward function).
Since the long-run average reward per jump as well as the long-run average length of a jump are the same for almost all w G Run(MA ,Y) whose footprint visits Q, the long-run average reward per configuration (i.e., the value of the random variables MPsup and MPinf introduced in Section 2.4) should be equal to
Lzec,^(z).(/(z) + Ez4Af/x.£{) ^
for almost all w G Run(MA,Y) whose footprint visits Q. This is correct if (4) is defined, i.e., the numerator or the denominator is finite. If they are both infinite (which may happen), then the problem of computing the expected mean payoff becomes more problematic. We may even have that MPsup(w) ^ MPinf (w) for almost all
5 The last configuration of a jump does not contribute to the total accumulated reward.
34
Tomáš Brázdil et al.
runs w G Run(M&, Y) whose footprint visits Q. A more detailed analysis of this problematic case is still missing in the literature.
Observe that the denominator of (4) is finite if is finite for every A e F|. This motivates the following definition (later we also discuss the problem of computing the expected mean payoff for general pPDA, and therefore we formulate our definition for general pPDA).
Definition 9 Let A — (Q,T, ^, Prob) be a pPDA. We say that A is well-terminating if for all p, q e Q and X E T such that [pXq] > 0 we have that Epxq < °°, where the underlying reward function assigns 1 to every configuration (cf. Section 5.1).
From now on, we assume that the pBPA A fixed above is well-terminating, which means that E\ is finite for all A e Since / is simple, we actually have that is also finite for all A e F|. For every BSCC C, of Xa, we use v, to denote the value defined by (4), and to denote the probability of reaching C, from Y in X^ ■ Now observe the following:
- The invariant distribution %i is the unique solution of the system of linear equations y — y • M, where M is the transition matrix of C,- (see, e.g., [21]).
- The tuple of all E^, where A e F| and E^ > 0, is the unique solution of the system of linear equations Expect A in ((M-0)*, C) (see Section 5.1). Note that the system Expect'A is constructible in polynomial time even if the transition probabilities of A are encoded just symbolically in Tarski algebra (since all E^ are finite, we only need to recognize those E^ that are equal to zero, i.e., we do not rely on the procedure outlined in Remark 3 which requires explicit values of transition probabilities).
Analogous observations hold for the tuple of all E\. Here the situation is even simpler, because all of E\ are positive.
- The probability is a component of the unique solution of an efficiently constructive system of linear equations, where the transition probabilities of X^ are used as coefficients (see, e.g., [5]).
The above observations imply that if the transition probabilities of A are given explicitly as rational numbers, then both v, and pi are rational and computable in polynomial time. If A is obtained by running the translation algorithm of Section 4 with some well-terminating pPDA on input, then the transition probabilities of A are encoded just symbolically in the existential fragment of Tarski algebra (but they are already guaranteed to be positive). In this case, we cannot compute v, and pi directly, but we can still express them in the existential fragment of Tarski algebra by a formula constructive in polynomial time.
Note that £P(MPsup — MPinf — v,) is not necessarily equal to because there can be another BSCC Cj of XA such that Vj — v,, and then 3?(MPsup — MPinf — v,) is at least pi + pj. Since all v,'s are expressible in the existential fragment of Tarski algebra, we can decide whether v, — vy in space polynomial in \\A \\, and thus identify all BSCCs with the same value.
Now we show how to lift the above results to general pPDA. Let us fix a well-terminating pPDA A, a simple reward function / over the configurations of A, and a
Probabilistic Pushdown Automata
35
configuration poXq G Q x F that cannot reach a configuration with empty stack (note that the problem whether a given pPDA A is well-terminating is decidable in polynomial space). If we interpret MPsup and MPinf as random variables over Run(poXo), we obtain the following:
- Let vi,..., vm be the eligible values obtained for the symbol (pqXqI) of the pBPA A. by the method described above, and let pi,...,pm be the associated probabilities. Then almost all runs of Run(poXo) can be split into m disjoint classes £%\S%m such that
- ^(Mi) = pi for all 1 < i < m;
- for almost all w G Mi we have that MPsup(w) — MPinf (w) — v,.
This follows from the correspondence between the runs in A and A. established in Section 4.
Thus, we obtain the following theorem:
Theorem 12 Let A — (Q,T, ^ ,Prob) be a well-terminating pPDA, f: Q x F* —> R-° a simple reward function, and poXo G Q x F a configuration that cannot reach a configuration with empty stack. Then for almost all w G Run(poXo) we have that MPsup(w) — MPinf (w), and there are at most \Q\ ■ |F| distinct values that MPsup and MPinf may assume over Run(poXo) with positive probability. Moreover, these values and the associated probabilities can by approximated up to a given absolute error 8 > 0 in polynomial space.
Let us note that Theorem 12 holds also for linear reward functions.
7 Model-Checking
Finally, we present the existing results about the model-checking problem for pPDA and formulae of linear-time and branching-time logics.
7.1 Linear-Time Logics
Results for pPDA and pBPA. Model-checking pPDA against deterministic Btichi specification was already studied in [25], where it was shown that the quantitative model-checking problem, i.e., the question whether the probability of all runs accepted by a given deterministic Biichi automaton is bounded by a given p G [0,1], is solvable in exponential time. This result was extended to deterministic Muller automata (and hence all o-regular properties) in [19]. The complexity of quantitative and qualitative model-checking problem for pPDA, pBPA and LTL was studied in greater detail in [30], where it was shown that for a given pPDA A and a given LTL formula • ,di„it,R), where D is a finite set of control states, E is a finite input alphabet, —>• CDxZxDisa deterministic and total transition relation, do G D is an initial state, and R — (E\,F\),(En,Fn) is a Rabin acceptance condition, where Ei,F{ C D.
Let u be an infinite word over the alphabet E. A computation of Q) over u is an infinite sequence of states c(u) — do, d\, di,... such that do — d{nit and d{ -^4- d{+\ for all i G Z-°. We say that u is accepted by Si if there is 1 < i < n such that all states of E{ appear in c(u) finitely many times, and at least one state of F{ appears in c(u) infinitely many times.
Let A — (Q,r, ^ ,Prob) be a pPDA, and let pqXq G Q x F be a configuration of A that cannot reach a configuration with empty stack. Further, let S — (D,E, ,dinit,R) be a DRA where E — Q x F is the input alphabet. We say that w G Run(p(jX(j) is recognized by S if the corresponding word poX§ PiXi piXi ■ ■ ■, where /?,-X, is the head of w(i), is accepted by Q). The set of all w G Run(poXo) that are recognized by Q) is denoted by Run(poXo, 2>).
Our aim is to compute/approximate the probability of Run(poXo,@). This is achieved in three steps.
Step I. We compute the synchronized product of A and Q), which is a pPDA A x Q) that behaves like A but also simulates the execution of Q) in its finite control. Hence, the set of control states of A x Q) is QxD, and if pX ^> qa is a rule of A and d d' a transition of Ql, then (p,d)X^- (q, d') a is a rule of A x Ql.
Let w G Run(M^x^, (po,di„it)Xo), and let inf(w) be the set of all d G D such that w visits infinitely many configurations with head of the form (q,d)Y. We say that w is accepting if there is 1 < i < n such that inf(w) n — 0 and inf(w) n F{ ^ 0. Let Run(AxS,(pQ,dinit)X(j,Accept) be the set of all accepting runs of Run(M^x^, (p(j,dinit)X(j). Since Q) just "observes" the computation of A m Ax2> without any real influence, we have that
PiRunipoXo,®)) = ^(Run(Ax2!,(po,dinit)Xo, Accept)).
Probabilistic Pushdown Automata
37
Step II. We translate Axinto the corresponding pBPA Ax3>. by the construction given in Section 4. Let w G Run(M^xS>., ((po,dinit)Xo^)), and let inf(w) be the set of all d G D such that w visits infinitely many configurations with head of the form ((q,d)YQ). Similarly as above, we say that w is accepting if there is 1 < i < n such that inf(w) n E{ — 0 and inf(w) n F{ ^ 0, and we use Run(Ax2>,, ((po,di„it)X()^),Accept) to denote the set of all accepting runs. Due to the correspondence between the runs in M^XS> and ^AxSi. established in Section 4, we can conclude that
(Run(A xS>,(p0, dinit)Xo, Accept)) = £? (Run (Ax3>.,( (p0, dini,)Xot), Accept)).
Another important observation is that if A is a pBPA, then the set of rules of A x2>. is computable in time polynomial in \\A || and \\@\\. This does not follow immediately from Remark 2, because A x is not a pBPA. However, if A is a pBPA, then the control unit of A x just stores the current state of '3, and hence for every configuration dY of A x we have that it terminates with the same probability as the configuration Y of A. Thus, all information need for computing the set of rules of A x 2>. can be obtained by analyzing the pBPA A.
Step III. We construct the Markov chain X^XS>. of Section 6, and classify each BSCC of X^XS>. as either "good" or "bad" with respect to the Rabin condition R. It turns out that almost all runs w G Run(Ax2>., ((p(t,dinit)X^)) whose footprint visits a good BSCC of X^xSi, are accepting, and almost all runs w G Run(Ax2>,, ((po,di„it)X()^)) whose footprint visits a bad BSCC of X^xs. are not accepting. Hence, £P(Run(poXo,3)) is equal to the probability of visiting a good BSCC in Xa x9. , and from this we obtain the desired results.
More concretely, let C be a BSCC of X^xs.- Note that the elements of C are symbols of the form ((q,d)Y^). We compute the set C;nf of all d' G D such that there exists a configuration with head of the form ((r,d')ZQ) reachable from some ((q, d)Y^) G C in the Markov chain x9. ■ Note that C;nf can be computed in polynomial time if the set of rules of A x 2>. is given (the precise probabilities of these rules are irrelevant). In particular, if A is a pBPA, then Cm can be computed in time polynomial in ||4|| and (see above). We declare C as good if there is 1 < i < n such that Cjnf P\E{ — 0 and C;nf P\F{ ^ 0. Now it suffices to realize that for almost every run w G Run(((poXo)X^)) in M& xs>. whose footprint in X^ XS>. visits C we have that inf(w) — Cm{. So, the probability
(Run (A x 9., ((pQ, dinit)X^), Accept))
is equal to the probability of visiting a good BSCC of X^XS>. from ((po,dmit)Xot) in X&xSi.- Since X^XS>. has finitely many states, this probability is a component in the unique solution of an effectively constructive system of linear equations whose coefficients are the transition probabilities of X^XS>. (see> e-g-> [5])- Since the transition probabilities of X^XS>. are effectively expressible in the existential fragment of Tarski algebra (cf. Theorem 2), the same can be said about the probability of our interest. Thus, we obtain the PSPACE upper bounds given in Fig. 4.
If A is a pBPA, then the rules of A x 2>, are computable in time polynomial in \\A || and \\@\\ (see above). This means that the transitions ofX^xS>. are also computable in
38
Tomáš Brázdil et al.
polynomial time (see Definition 8). Hence, we can decide in polynomial time whether all BSCC of X^xSt. reachable from ((po,dinit)Xo'l) good. Thus, we obtain the P upper bound for qualitative DRA properties and pBPA of Fig. 4.
Results for pOC. The qualitative and quantitative model-checking problems for pOC and linear-time properties encoded by DRA was analyzed in [15]. To some extent, the method is similar to the one for pPDA described in the previous paragraph, but the underlying analytical techniques are different. Again, it is shown that the probability of all runs that are recognized6 by a given DRA is equal to the probability of visiting a "good" BSCC in a suitable finite-state Markov chain Y^XS> (the chain Y^XS> is somewhat different from the chain X^xSi, used for pPDA). The set of transitions of Y&xSi is computable in polynomial time, and the "good" BSCCs of Y^XS> are also computable in polynomial time. Thus, we obtain the following:
Theorem 13 Let A = (Q, 8=0,8>Q,P=Q,P>0) be a pOC and <2> = (D,E, ->• ,dinit,R) a DRA where E — Q x {0,1}. For every configuration p(k), where k is encoded in unary, the problem whether almost all runs initiated in p(k) are recognized by is in P.
Further, one can efficiently approximate the probability of reaching a good BSCC in Y&xSi by approximating the values of its transition probabilities and solving the corresponding system of linear equations. The underlying analysis is not completely simple and relies on the divergence gap theorem, which bounds a positive non-termination probability in pOC away from zero (this theorem has been established in [15] by analyzing the martingale constructed in Section 5.1). This leads to the following result:
Theorem 14 Let A = {Q, 8=°, 8>Q,P=Q,P>0) be a pOC, 9 = (D,E, ,dinit,R) a DRA where E — Q x {0,1}, and 8 > 0 a rational constant. For every configuration p(k), where k is encoded in unary, the probability of all runs initiated in p(k) that are recognized by can be approximated up to the relative error 8 in polynomial time on the unit-cost rational arithmetic RAM.
7.2 Branching-Time Logics
The currently known results about model-checking pPDA and pBPA against branching-time formulae of PCTL and PCTL* established in [19,12] are summarized in Fig. 5. The abbreviation "p.c." stands for program complexity, i.e., the respective upper bounds hold for an arbitrary fixed formula, and the lower bounds hold for some fixed formula.
The undecidability of the model-checking problem for pPDA and PCTL is proven by reduction from the Post's correspondence problem (see below). The fundamental idea of encoding words into probabilities which lies behind this proof appeared for the first time in [14]. The undecidability proof for pBPA and PCTL* is obtained as
6 Formally, the "head" of a given pOC configuration p{k) is either (p,0) or (/>, 1), depending on whether k = 0 or k > 0, respectively. The input alphabet of the corresponding DRA is then Q x {0,1}.
Probabilistic Pushdown Automata
39
pBPA
pPDA
PCTL PCTL* qPCTL qPCTL*
undecidable (fixed formula) EXPTIME-complete 2-EXPTIME-complete P P
?
undecidable (fixed formula) undecidable (fixed formula)
qPCTL (p.c.) qPCTL* (p.c.)
EXPTIME-complete 2-EXPTIME-complete EXPTIME-complete EXPTIME-complete
Fig. 5 Summary of results for pPDA/pBPA and branching-time model-checking.
a slight modification of the construction used for pPDA and PCTL. However, the question whether this undecidability result can be extended to pBPA and PCTL is still open.
The 2-EXPTIME (and EXPTIME) upper bounds on model-checking pPDA against qPCTL* (and qPCTL) rely on the results of [30] about LTL model-checking for pPDA. Using these results, one can show that the set of all configurations that almost surely satisfy a given path formula with regular valuations of atomic propositions is also effectively regular, i.e., the associated DFA is effectively constructive. From this one obtains a model checking algorithm for qPCTL*.
The 2-EXPTIME (and EXPTIME) lower bounds on model-checking pPDA against qPCTL* (and qPCTL) are based on standard techniques from non-probabilistic model-checking algorithms [42,7].
We sketch the main idea of the undecidability proof for model-checking pPDA against PCTL formulae. The result is obtained by reduction from (a slightly modified version of) the Post's correspondence problem (PCP). An instance of PCP consists of two sequences x\,... ,x„ and y\,... ,yn of words over the alphabet E — {A,B, •} such that all x, and yj have the same length m. The question is whether there is a finite sequence i\, ■ ■ ■ ,4 of indexes such that x(1 • • -X{k and ytl ■ ■ -y^ are the same words after erasing all occurrences of Given such an instance, we construct a pPDA A where
- the number of control states is ff(m-n), and there are always three distinguished control states g, c, and t;
- the stack alphabet of A is fixed and contains the symbols Z, Y, and nine symbols of the form (z,z') where z,z' EE.
Further, we define a PCTL formula
where the atomic propositions cZ and tY are valid in exactly all configurations with head cZ and tY, respectively. The formula is an abbreviation for
Our construction ensures that the given PCP instance has a solution iff gZ \=v
0(cZA0
l'2tY)
40
Tomáš Brázdil et al.
[cZ(A,A)(A,t)(t,Á)(B,B)ZJ [v(A,A)(A,«)(«,A)(fl,fl)zj (v(A,A)(A,«)(«,A)(fl,fl)z)
I tY {A,»'){», A'){B,B')Z j) [ v(A «)(« A)(fl,fl)Z ~) ( vQV)(*,A)(B,B)Z ~] [ rF(A, «)q,A)(fl,fl)Z ~)
I tf(«,A)(fl,fl)Z j) [ v(*,A)(B,B)Z ~) [ v(«,A)(fl,fl)Z ~)
[ v(B,B)Z ] [ v(fl,fl)Z ] ( rYZ ]
Fig. 6 Configurations reachable from cZ(A,A)(A, •)(»,A)(B,B)Z.
stored as a sequence of three stack symbols (A,B), (A,A), (B,u), where is on
top of the stack. After storing a chosen pair of words, the automaton can either go on with guessing another pair of words, or enter a checking configuration by changing the control state from g to c and pushing the symbol Z on top of the stack. The transition probabilities do not matter here, and hence we assume they are distributed uniformly. This "guessing phase" is formalized below. Note that a pair (x{,yi) needs to be pushed "from right to left" because the top of the stack is on the left-hand side (we use x,- and y,- to denote the reverse of x,- and y,-, respectively). Since transition probabilities are distributed uniformly, we do not write them explicitly. The symbol "|" separates alternatives.
g7+lx
s\x
cZX I gX
six,
j+\mMj))x,
Here 1 < i < n, 1 < j < m, x,(j) and yi(j) denote the jth letters in x,- and y,-, respectively, and X ranges over the stack alphabet.
Obviously, a configuration of the form cZa is reachable from gZ iff a = (a\,bi) ■ ■ ■ (a(,b()Z and there is a sequence i\,... ,it such that ai ■ ■ -a\ — x(1 • • -xtk and b( ■ ■ ■ bi — yt{ ■ ■ ■ yih. The crucial part of the construction is the next phase which verifies that the guess was correct, i.e., that the words stored in the first and the second
Probabilistic Pushdown Automata
41
component of stack symbols are the same when "•" is disregarded. First, we erase Z from the top of the stack and change the control state to either v or v, which intuitively means that we are going to read the letters stored either in the first or in the second component of stack symbols, respectively. Then, we start to erase the symbols from the stack one by one according to the following transition rules (again, the transition probabilities are distributed uniformly):
cZ^ve\ve, v(A,z)^tY | ve, v(z,A)^rY \ve, tY^tY, v(B,z)^rY |ve, v(z,B)^tY \ve, rY^rY
vZ^tY | rY, vZ^tY | rY,
Here z ranges over E. We claim that a checking configuration satisfies the formula ()=ll2tY iff the previous guess was correct. To get some intuition, let us evaluate the probability of reaching a configuration with the head tY for, e.g., a configuration cZ(A,A)(A,9)(9,A)(B,B)Z. By inspecting the part of MA reachable from cZ(A,A)(A,9)(9,A)(B,B)Z (see Fig. 6), one can easily confirm that this probability is equal to
2 V V 2 2 2 24 J \ 2 22 23 24
which can be written in binary as follows: ^(0.1101 +0.0011). The first three digits after the binary point in 0.1101 and 0.0011 reflect the structure of the words AAuB and AuAB stored in the stack, and the last 1 is due to the Z at the very bottom. Note that the role of A in the two words is dual—in the first case, it generates 1 's, and in the second case it generates 0's (similarly for B). The symbol • is just popped from the stack with probability one, which does not influence the probability of reaching a configuration satisfying tY. Note that the probabilities 0.1101 and 0.0011 are "complementary" and their sum is equal to 1. This "complementarity" breaks down iff the words stored in the first and the second component of stack symbols are not the same, in which case the sum is different from 1.
The above construction does not directly work for pBPA, because here we cannot carry any information down the stack when popping the symbols. It is possible to overcome this problem, but the constructed formula becomes more complicated and it is expressible only in PCTL*.
References
1. Allender, E., Biirgisser, P., Kjeldgaard-Pedersen, J., Miltersen, P.: On the complexity of numerical analysis. SIAM Journal of Computing 38, 1987-2006 (2008)
2. Alur, R., Chaudhuri, S., Etessami, K., Madhusudan, P.: On-the-fly reachability and cycle detection for recursive state machines. In: Proceedings of TACAS 2005, Lecture Notes in Computer Science, vol. 3440, pp. 61-76. Springer (2005)
3. Alur, R., Etessami, K., Yannakakis, M.: Analysis of recursive state machines. In: Proceedings of CAV 2001, Lecture Notes in Computer Science, vol. 2102, pp. 207-220. Springer (2001)
4. Athreya, K., Ney, P.: Branching Processes. Springer (1972)
5. Baier, C, Katoen, J.P: Principles of Model Checking. The MIT Press (2008)
42
Tomáš Brázdil et al.
6. Bouajjani, A., Esparza, J., Maler, O.: Reachability analysis of pushdown automata: application to model checking. In: Proceedings of CONCUR'97, Lecture Notes in Computer Science, vol. 1243, pp. 135-150. Springer (1997)
7. Bozzelli, L.: Complexity results on branching-time pushdown model checking. In: Proceedings of VMCAI 2006, Lecture Notes in Computer Science, vol. 3855, pp. 65-79. Springer (2006)
8. Brázdil, T, Brožek, V., Holeček, J., Kučera, A.: Discounted properties of probabilistic pushdown automata. In: Proceedings of LPAR 2008, Lecture Notes in Computer Science, vol. 5330, pp. 230-242. Springer (2008)
9. Brázdil, T, Brožek, V., Etessami, K.: One-counter stochastic games. In: Proceedings of FST&TCS 2010, Leibniz International Proceedings in Informatics, vol. 8, pp. 108-119. Schloss Dagstuhl-Leibniz-Zentrum für Informatik (2010)
10. Brázdil, T, Brožek, V., Etessami, K., Kučera, A.: Approximating the termination value of one-counter MDPs and stochastic games. In: Proceedings of ICALP 2011, Part II, Lecture Notes in Computer Science, vol. 6756, pp. 332-343. Springer (2011)
11. Brázdil, T, Brožek, V., Etessami, K., Kučera, A., Wojtczak, D.: One-counter Markov decision processes. In: Proceedings of SODA 2010, pp. 863-874. SIAM (2010)
12. Brázdil, T, Brožek, V., Forejt, V.: Branching-time model-checking of probabilistic pushdown automata. Electronic Notes in Theoretical Computer Science 239, 73-83 (2009)
13. Brázdil, T, Brožek, V., Forejt, V., Kučera, A.: Reachability in recursive Markov decision processes. Information and Computation 206(5), 520-537 (2008)
14. Brázdil, T, Esparza, J., Kučera, A.: Analysis and prediction of the long-run behavior of probabilistic sequential programs with recursion. In: Proceedings of FOCS 2005, pp. 521-530. IEEE Computer Society Press (2005)
15. Brázdil, T, Kiefer, S., Kučera, A.: Efficient analysis of probabilistic programs with an unbounded counter. In: Proceedings of CAV 2011, Lecture Notes in Computer Science, vol. 6806, pp. 208-224. Springer (2011)
16. Brázdil, T, Kiefer, S., Kučera, A., Hutařová Vařeková, L: Runtime analysis of probabilistic programs with unbounded recursion. CoRR abs/1007.1710 (2010)
17. Brázdil, T, Kiefer, S., Kučera, A., Hutařová Vařeková, L: Runtime analysis of probabilistic programs with unbounded recursion. In: Proceedings of ICALP 2011, Part II, Lecture Notes in Computer Science, vol. 6756, pp. 319-331. Springer (2011)
18. Brázdil, T, Kučera, A., Strážovský, O.: Deciding probabilistic bisimilarity over infinite-state probabilistic systems. In: Proceedings of CONCUR 2004, Lecture Notes in Computer Science, vol. 3170, pp. 193-208. Springer (2004)
19. Brázdil, T, Kučera, A., Strážovský, O.: On the decidability of temporal properties of probabilistic pushdown automata. In: Proceedings of STACS 2005, Lecture Notes in Computer Science, vol. 3404, pp. 145-157. Springer (2005)
20. Canny, J.: Some algebraic and geometric computations in PSPACE. In: Proceedings of STOC'88, pp. 460-467. ACM Press (1988)
21. Chung, K.: Markov Chains with Stationary Transition Probabilities. Springer (1967)
22. Emerson, E.: Temporal and modal logic. Handbook of Theoretical Computer Science B, 995-1072 (1991)
23. Esparza, J., Gaiser, A., Kiefer, S.: Computing least fixed points of probabilistic systems of polynomials. In: Proceedings of STACS 2010, Leibniz International Proceedings in Informatics, vol. 5, pp. 359-370. Schloss Dagstuhl-Leibniz-Zentrum für Informatik (2010)
24. Esparza, J., Hansel, D., Rossmanith, P., Schwoon, S.: Efficient algorithms for model checking pushdown systems. In: Proceedings of CAV 2000, Lecture Notes in Computer Science, vol. 1855, pp. 232-247. Springer (2000)
25. Esparza, J., Kučera, A., Mayr, R.: Model-checking probabilistic pushdown automata. In: Proceedings of LICS 2004, pp. 12-21. IEEE Computer Society Press (2004)
26. Esparza, J., Kučera, A., Mayr, R.: Quantitative analysis of probabilistic pushdown automata: Expectations and variances. In: Proceedings of LICS 2005, pp. 117-126. IEEE Computer Society Press (2005)
27. Etessami, K., Stewart, A., Yannakakis, M.: Polynomial time algorithms for multi-type branching processes and stochastic context-free grammars. In: Proceedings of STOC 2012, pp. 579-588. ACM Press (2012)
28. Etessami, K., Wojtczak, D., Yannakakis, M.: Quasi-birth-death processes, tree-like QBDs, probabilistic 1-counter automata, and pushdown systems. In: Proceedings of 5th Int. Conf. on Quantitative Evaluation of Systems (QEST'08). IEEE Computer Society Press (2008)
Probabilistic Pushdown Automata
43
29. Etessami, K., Yannakakis, M.: Algorithmic verification of recursive probabilistic systems. In: Proceedings of TACAS 2005, Lecture Notes in Computer Science, vol. 3440, pp. 253-270. Springer (2005)
30. Etessami, K., Yannakakis, M.: Checking LTL properties of recursive Markov chains. In: Proceedings of 2nd Int. Conf. on Quantitative Evaluation of Systems (QEST'05), pp. 155-165. IEEE Computer Society Press (2005)
31. Etessami, K., Yannakakis, M.: Recursive Markov chains, stochastic grammars, and monotone systems of non-linear equations. In: Proceedings of STACS 2005, Lecture Notes in Computer Science, vol. 3404, pp. 340-352. Springer (2005)
32. Etessami, K., Yannakakis, M.: Recursive Markov chains, stochastic grammars, and monotone systems of nonlinear equations. Journal of the Association for Computing Machinery 56 (2009)
33. Grigoriev, D.: Complexity of deciding Tarski algebra. Journal of Symbolic Computation 5(1-2), 65-108 (1988)
34. Hansson, H., Jonsson, B.: A logic for reasoning about time and reliability. Formal Aspects of Computing 6, 512-535 (1994)
35. Harris, T.: The Theory of Branching Processes. Springer (1963)
36. Hopcroft, J., Ullman, J.: Introduction to Automata Theory, Languages, and Computation. Addison-Wesley (1979)
37. Kiefer, S., Luttenberger, M., Esparza, J.: On the convergence of Newton's method for monotone systems of polynomial equations. In: Proceedings of STOC 2007, pp. 217-226. ACM Press (2007)
38. Mayr, R.: Strict lower bounds for model checking BPA. Electronic Notes in Theoretical Computer Science 18 (1998)
39. Rosenthal, J.: A first look at rigorous probability theory. World Scientific Publishing (2006)
40. Tarski, A.: A Decision Method for Elementary Algebra and Geometry. Univ. of California Press, Berkeley (1951)
41. Tarski, A.: A lattice-theoretical fixpoint theorem and its applications. Pacific Journal of Mathematics 5(2), 285-309 (1955)
42. Walukiewicz, I.: Model checking CTL properties of pushdown systems. In: Proceedings of FST&TCS'2000, Lecture Notes in Computer Science, vol. 1974, pp. 127-138. Springer (2000)
43. Williams, D.: Probability with Martingales. Cambridge University Press (1991)