Automata theory
An algorithmic approach
0
Lecture Notes
Javier Esparza
July 20, 2012
2
Chapter 9
Automata and Logic
A regular expression can be seen as a set of instructions ( a ‘recipe’) for generating the words of a
language. For instance, the expression aa(a + b)∗b can be interpreted as “write two a’s, repeatedly
write a or b an arbitrary number of times, and then write a b”. We say that regular expressions are
an operational description language.
Languages can also be described in declarative style, as the set of words that satisfy a property.
For instance, “the words over {a, b} containing an even number of a’s and an even number of b’s” is
a declarative description. A language may have a simple declarative description and a complicated
operational description as a regular expression. For instance, the regular expression
(aa + bb + (ab + ba)(aa + bb)∗
(ba + ab))∗
is a natural operational description of the language above, and it is arguably less intuitive than
the declarative one. This becomes even more clear if we consider the language of the words over
{a, b, c} containing an even number of a’s, of b’s, and of c’s.
In this chapter we present a logical formalism for the declarative description of regular languages.
We use logical formulas to describe properties of words, and logical operators to construct
complex properties out of simpler ones. We then show how to automatically translate a formula
describing a property of words into an automaton recognizing the words satisfying the property.
As a consequence, we obtain an algorithm to convert declarative into operational descriptions, and
vice versa.
9.1 First-Order Logic on Words
In declarative style, a language is deﬁned by its membership predicate, i.e., the property that words
must satisfy in order to belong to it. Predicate logic is the standard language to express membership
predicates. Starting from some natural, “atomic” predicates, more complex ones can be constructed
through boolean combinations and quantiﬁcation. We introduce atomic predicates Qa(x), where a
is a letter, and x ranges over the positions of the word. The intended meaning is “the letter at
159
160 CHAPTER 9. AUTOMATA AND LOGIC
position x is an a.” For instance, the property “all letters are as” is formalized by the formula
∀x Qa(x).
In order to express relations between positions we add to the syntax the predicate x < y, with
intended meaning “position x is smaller than (i.e., lies to the left of) position y”. For example, the
property “if the letter at a position is an a, then all letters to the right of this position are also as” is
formalized by the formula
∀x∀y ((Qa(x) ∧ x < y) → Qa(y)) .
Deﬁnition 9.1 Let V = {x, y, z, . . .} be an inﬁnite set of variables, and let Σ = {a, b, c, . . .} be a ﬁnite
alphabet. The set FO(Σ) of ﬁrst-order formulas over Σ is the set of expressions generated by the
grammar:
ϕ := Qa(x) | x < y | ¬ϕ | (ϕ ∨ ϕ) | ∃x ϕ .
As usual, variables within the scope of an existential quantiﬁer are bounded, and otherwise free. A
formula without free variables is a sentence. Sentences of FO(Σ) are interpreted on words over Σ.
For instance, ∀x Qa(x) is true for the word aa, but false for word ab. Formulas with free variables
cannot be interpreted on words alone: it does not make sense to ask whether Qa(x) holds for the
word ab or not. A formula with free variables is interpreted over a pair (w, I), where I assigns to
each free variable (and perhaps to others) a position in the word. For instance, Qa(x) is true for the
pair (ab, x → 1), because the letter at position 1 of ab is a, but false for (ab, x → 2).
Deﬁnition 9.2 An interpretation of a formula ϕ of FO(Σ) is a pair (w, I) where w ∈ Σ∗ and I is a
mapping that assigns to every free variable x a position I(x) ∈ {1, . . . , |w|} (the mapping may also
assign positions to other variables).
Notice that if ϕ is a sentence then a pair (w, E), where E is the empty mapping that does not assign
any position to any variable, is an interpretation of ϕ. Instead of (w, E) we write simply w.
We now formally deﬁne when an interpretation satisﬁes a formula. Given a word w and a
number k, let w[k] denote the letter of w at position k.
Deﬁnition 9.3 The satisfaction relation (w, I) |= ϕ between a formula ϕ of FO(Σ) and an interpretation
(w, I) of ϕ is deﬁned by:
(w, I) |= Qa(x) iff w[I(x)] = a
(w, I) |= x < y iff I(x) < I(y)
(w, I) |= ¬ϕ iff (w, I) |= ϕ
(w, I) |= ϕ1 ∨ ϕ2 iff (w, I) |= ϕ1 or (w, I) |= ϕ2
(w, I) |= ∃x ϕ iff |w| ≥ 1 and some i ∈ {1, . . . , |w|} satisﬁes (w, I[i/x]) |= ϕ
where w[i] is the letter of w at position i, and I[i/x] is the mapping that assigns i to x and otherwise
coincides with I. (Notice that I may not assign any value to x.) If (w, I) |= ϕ we say that (w, I) is a
model of ϕ. Two formulas are equivalent if they have the same models.
9.1. FIRST-ORDER LOGIC ON WORDS 161
It follows easily from this deﬁnition that if two interpretations (w, I1) and (w, I2) of ϕ diﬀer
only in the positions assigned by I1 and I2 to bounded variables, then either both interpretations
are models of ϕ, or none of them is. In particular, whether an interpretation (w, I) of a sentence is
a model or not depends only on w, not on I.
We use some standard abbreviations:
∀x ϕ := ¬∃ x¬ϕ ϕ1 ∧ ϕ2 := ¬ (¬ϕ1 ∨ ¬ϕ2) ϕ1 → ϕ2 := ¬ϕ1 ∨ ϕ2
Notice that according to the deﬁnition of the satisfaction relation the empty word satisﬁes no
formulas of the form ∃x ϕ, and all formulas of the form ∀x ϕ. While this causes no problems for
our purposes, it is worth noticing that in other contexts it may lead to complications. For instance,
the formulas ∃x Qa(x) and ∀y∃x Qa(x) do not hold for exactly the same words, because the empty
word satisﬁes the second, but not the ﬁrst. Further useful abbreviations are:
ﬁrst(x) := ¬∃y y < x “x is the ﬁrst position”
last(x) := ¬∃y x < y “x is the last position”
y = x + 1 := x < y ∧ ¬∃ z(x < z ∧ z < y) “y is the successor position of x”
y = x + 2 := ∃ z(z = x + 1 ∧ y = z + 1)
y = x + (k + 1) := ∃ z(z = x + k ∧ y = z + 1)
Example 9.4 Some examples of properties expressible in the logic:
• “The last letter is a b and before it there are only a’s.”
∃x Qb(x) ∧ ∀x (last(x) → Qb(x) ∧ ¬last(x) → Qa(x))
• “Every a is immediately followed by a b.”
∀x (Qa(x) → ∃y (y = x + 1 ∧ Qb(y)))
• “Every a is immediately followed by a b, unless it is the last letter.”
∀x (Qa(x) → ∀y (y = x + 1 → Qb(y)))
• “Between every a and every later b there is a c.”
∀x∀y (Qa(x) ∧ Qb(y) ∧ x < y → ∃z (x < z ∧ z < y ∧ Qc(z)))
162 CHAPTER 9. AUTOMATA AND LOGIC
9.1.1 Expressive power of FO(Σ)
Once we have deﬁned which words satisfy a sentence, we can associate to a sentence the set of
words satisfying it.
Deﬁnition 9.5 The language L(ϕ) of a sentence ϕ ∈ FO(Σ) is the set L(ϕ) = {w ∈ Σ∗ | w |= φ}. We
also say that ϕ expresses L(ϕ). A language L ⊆ Σ∗ is FO-deﬁnable if L = L(ϕ) for some formula ϕ
of FO(Σ).
The languages of the properties in the example are FO-deﬁnable by deﬁnition. To get an idea
of the expressive power of FO(Σ), we prove a theorem characterizing the FO-deﬁnable languages
in the case of a 1-letter alphabet Σ = {a}. In this simple case we only have one predicate Qa(x),
which is always true in every interpretation. So every formula is equivalent to a formula without
any occurrence of Qa(x). For example, the formula ∃y (Qa(y) ∧ y < x) is equivalent to ∃y y < x.
We prove that a language over a one-letter alphabet is FO-deﬁnable if and only if it is ﬁnite
or co-ﬁnite, where a language is co-ﬁnite if its complement is ﬁnite. So, for instance, even a
simple language like {an | n is even } is not FO-deﬁnable. The plan of the proof is as follows.
First, we deﬁne the quantiﬁer-free fragment of FO({a}), denoted by QF; then we show that 1-letter
languages are QF-deﬁnable iff they are ﬁnite or co-ﬁnite; ﬁnally, we prove that 1-letter languages
are FO-deﬁnable iff they are QF-deﬁnable.
For the deﬁnition of QF we need some more macros whose intended meaning should be easy
to guess:
x + k < y := ∃z (z = x + k ∧ z < y)
x < y + k := ∃z (z = y + k ∧ x < z)
k < last := ∀x (last(x) → x > k)
In these macros k is a constant, that is, k < last standa for the inﬁnite family of macros 1 < last, 2 <
last, 3 < last . . .. Macros like k > x or x + k > y are deﬁned similarly.
Deﬁnition 9.6 The logic QF (for quantiﬁer-free) is the fragment of FO({a}) with syntax
f := x ≈ k | x ≈ y + k | k ≈ last | f1 ∨ f2 | f1 ∧ f2
where ≈ ∈ {<, >} and k ∈ N.
Proposition 9.7 A language over a 1-letter alphabet is QF-deﬁnable iff it is ﬁnite or co-ﬁnite.
Proof: (⇒): Let f be a sentence of QF. Since QF does not have quantiﬁers, f does not contain
any occurrence of a variable, and so it is a positive (i.e., negation-free) boolean combination of
formulas of the form k < last or k > last. We proceed by induction on the structure of f. If
f = k < last, then L(ϕ) is co-ﬁnite, and if f = k > last, then L(ϕ) is ﬁnite. If f = f1 ∨ f2, then by
induction hypothesis L(f1) and L(f2) are ﬁnite or co-ﬁnite; if L(f1) and L( f2) are ﬁnite, then so is
L(f), and otherwise L(f) is co-ﬁnite. The case f = f1 ∧ f2 is similar.
9.2. MONADIC SECOND-ORDER LOGIC ON WORDS 163
(⇐): A ﬁnite language {ak1 , . . . , akn } is expressed by the formula (last > k1 − 1 ∧ last <
k1 + 1) ∨ . . . ∨ (last > k1 − 1 ∧ last < k1 + 1). To express a co-ﬁnite language, it suﬃces to show
that for every formula f of QF expressing a language L, there is another formula f expressing the
language L. This is easily proved by induction on the structure of the formula.
Theorem 9.8 Every formula ϕ of FO({a}) is equivalent to a formula f of QF.
Proof: Sketch. By induction on the structure of ϕ. If ϕ(x, y) = x < y, then ϕ ≡ y < x + 0. If
ϕ = ¬ψ, the result follows from the induction hypothesis and the fact that negations can be removed
using De Morgan’s rules and equivalences like ¬(x < y + k) ≡ x ≥ y + k. If ϕ = ϕ1 ∨ ϕ2, the result
follows directly from the induction hypothesis. Consider now the case ϕ = ∃x ψ. By induction
hypothesis, ψ is equivalent to a formula f of QF, and we can assume that f is in disjunctive normal
form, say f = D1 ∨ . . . ∨ Dn. Then ϕ ≡ ∃x D1 ∨ ∃x D2 ∨ . . . ∨ ∃x Dn, and so it suﬃces to ﬁnd a
formula fi of QF equivalent to ∃x Di.
The formula fi is a conjunction of formulas containing all conjuncts of Di with no occurrence
of x, plus other conjuncts obtained as follows. For every lower bound x < t1 of Di, where t1 = k1
or t1 = x1 + k1, and every upper bound of the form x > t2, where t2 = k1 or t2 = x1 + k1 we add to
fi a conjunct equivalent to t2 + 1 < t1. For instance, y + 7 < x and x < z + 3 we add y + 5 < z. It is
easy to see that fi ≡ ∃x Di.
Corollary 9.9 The language Even = {a2n | n ≥ 0} is not ﬁrst-order expressible.
These results show that ﬁrst-order logic cannot express all regular languages, not even over a
1-letter alphabet. For this reason we now introduce monadic second-order logic.
9.2 Monadic Second-Order Logic on Words
Monadic second-order logic extends ﬁrst-order logic with variables X, Y, Z, . . . ranging over sets of
positions, and with predicates x ∈ X, meaning “position x belongs to the set X. 1 It is allowed
to quantify over both kinds of variables. Before giving a formal deﬁnition, let us informally see
how this extension allows to describe the language Even. The formula states that the last position
belongs to the set of even positions. A position belongs to this set iff it is the second position, or
the second successor of another position in the set.
The following formula states that X is the set of even positions:
second(x) := ∃y (ﬁrst(y) ∧ x = y + 1)
Even(X) := ∀x (x ∈ X ↔ (second(x) ∨ ∃y (x = y + 2 ∧ y ∈ X)))
For the complete formula, we observe that the word has even length if its last position is even:
EvenLength := ∃X (Even(X) ∧ ∀x (last(x) → x ∈ X) )
1
More generally, second-order logic allows for variables ranging over relations of arbitrary arity. The monadic
fragment only allows arity 1, which corresponds to sets.
164 CHAPTER 9. AUTOMATA AND LOGIC
We now deﬁne the formal syntax and semantics of the logic.
Deﬁnition 9.10 Let X1 = {x, y, z, . . .} and X2 = {X, Y, Z, . . .} be two inﬁnite sets of ﬁrst-order and
second-order variables. Let Σ = {a, b, c, . . .} be a ﬁnite alphabet. The set MSO(Σ) of monadic
second-order formulas over Σ is the set of expressions generated by the grammar:
ϕ := Qa(x) | x < y | x ∈ X | ¬ϕ | ϕ ∨ ϕ | ∃x ϕ | ∃X ϕ
An interpretation of a formula ϕ is a pair (w, I) where w ∈ Σ∗, and I is a mapping that assigns every
free ﬁrst-order variable x a position I(x) ∈ {1, . . . , |w|} and every free second-order variable X a
set of positions I(X) ⊆ {1, . . . , |w|}. (The mapping may also assign positions to other variables.)
The satisfaction relation (w, I) |= ϕ between a formula ϕ of MSO(Σ) and an interpretation (w, I)
of ϕ is deﬁned as for FO(Σ), with the following additions:
(w, I) |= x ∈ X iff I(x) ∈ I(X)
(w, I) |= ∃X ϕ iff |w| > 0 and some S ⊆ {1, . . . , |w|}
satisﬁes (w, I[S/X]) |= ϕ
where I[S/X] is the interpretation that assigns S to X and otherwise coincides with I — whether
I is deﬁned for X or not. If (w, I) |= ϕ we say that (w, I) is a model of ϕ. Two formulas are
equivalent if they have the same models. The language L(ϕ) of a sentence ϕ ∈ MSO(Σ) is the set
L(ϕ) = {w ∈ Σ∗ | w |= φ}. A language L ⊆ Σ∗ is MSO-deﬁnable if L = L(ϕ) for some formula
ϕ ∈ MSO(Σ).
Notice that in this deﬁnition the set S may be empty. So, for instance, ay interpretation that assigns
the empty set to X is a model of the formula ∃X ∀x ¬(x ∈ X).
We use the standard abbreviations
∀x ∈ X ϕ := ∀x (x ∈ X → ϕ) ∃x ∈ X ϕ := ∃x (x ∈ X ∧ ϕ)
9.2.1 Expressive power of MSO(Σ)
We show that the languages expressible in monadic second-order logic are exactly the regular
languages. We start with an example.
Example 9.11 Let Σ = {a, b, c, d}. We construct a formula of MSO(Σ) expressing the regular
language c∗(ab)∗d∗. The membership predicate of the language can be informally formulated as
follows:
There is a block of consecutive positions X such that: before X there are only c’s; after
X there are only d’s; in X b’s and a’s alternate; the ﬁrst letter in X is an a and the last
letter is a b.
The predicate is a conjunction of predicates. We give formulas for each of them.
9.2. MONADIC SECOND-ORDER LOGIC ON WORDS 165
• “X is a block of consecutive positions.”
Cons(X) := ∀x ∈ X ∀y ∈ X (x < y → (∀z (x < z ∧ z < y) → z ∈ X))
• “x lies before/after X.”
Before(x, X) := ∀y ∈ X x < y After(x, X) := ∀y ∈ X y < x
• “Before X there are only c’s.”
Before only c(X) := ∀x Before(x, X) → Qc(x)
• “After X there are only d’s.”
After only d(X) := ∀x After(x, X) → Qd(x)
• “a’s and b’s alternate in X.”
Alternate(X) := ∀x ∈ X ( Qa(x) → ∀y ∈ X (y = x + 1 → Qb(y) )
∧
Qb(x) → ∀y ∈ X (y = x + 1 → Qa(y) ) )
• ”The ﬁrst letter in X is an a and the last is a b.”
First a(X) := ∀x ∈ X ∀y (y < x → ¬y ∈ X) → Qa(x)
Last b(X) := ∀x ∈ X ∀y (y > x → ¬y ∈ X) → Qa(x)
Putting everything together, we get the formula
∃X( Cons(X) ∧ Before only c(X) ∧ After only d(X) ∧
Alternate(X) ∧ First a(X) ∧ Last b(X) )
Notice that the empty word is a model of the formula. because the empty set of positions satisﬁes
all the conjuncts.
Let us now directly prove one direction of the result.
Proposition 9.12 If L ⊆ Σ∗ is regular, then L is expressible in MSO(Σ).
166 CHAPTER 9. AUTOMATA AND LOGIC
Proof: Let A = (Q, Σ, δ, q0, F) be a DFA with Q = {q0, . . . , qn} and L(A) = L. We construct a
formula ϕA such that for every w , w |= ϕA iff w ∈ L(A). If ∈ L(A), then we can extend the
formula to ϕA ∨ ϕA, where ϕA is only satisﬁed by the empty word (e.g. ϕA = ∀x x < x).
We start with some notations. Let w = a1 . . . am be a word over Σ, and let
Pq = i ∈ {1, . . . , m} | ˆδ(q0, a0 . . . ai) = q .
In words, i ∈ Pq iff A is in state q immediately after reading the letter ai. Then A accepts w iff
m ∈ q∈F Pq.
Assume we were able to construct a formula Visits(X0, . . . Xn) with free variables X0, . . . Xn
such that I(Xi) = Pqi holds for every model (w, I) and for every 0 ≤ i ≤ n. In words, Visits(X0, . . . Xn)
is only true when Xi takes the value Pqi for every 0 ≤ i ≤ n. Then (w, I) would be a model of
ψA := ∃X0 . . . ∃Xn Visits(X0, . . . Xn) ∧ ∃x

last(x) ∧
qi∈F
x ∈ Xi


iff w has a last letter, and w ∈ L. So we could take
ϕA :=
ψA if q0 F
ψA ∨ ∀x x < x if q0 ∈ F
Let us now construct the formula Visits(X0, . . . Xn). The sets Pq are the unique sets satisfying
the following properties:
(a) 1 ∈ Pδ(q0,a1), i.e., after reading the letter at position 1 the DFA is in state δ(q0, a1);
(b) every position i belongs to exactly one Pq, i.e., the Pq’s build a partition of the set positions;
and
(c) if i ∈ Pq and δ(q, ai+1) = q then i + 1 ∈ Pq , i.e., the Pq’s “respect” the transition function δ.
We express these properties through formulas. For every a ∈ Σ, let qia = δ(q0, a). The formula for
(a) is:
Init(X0, . . . , Xn) = ∃x

ﬁrst(x) ∧


a∈Σ
(Qa(x) ∧ x ∈ Xia )




(in words: if the letter at position 1 is a, then the position belongs to Xia ).
Formula for (b):
Partition(X0, . . . , Xn) = ∀x


n
i=0
x ∈ Xi ∧
n
i, j = 0
i j
(x ∈ Xi → x Xj)


9.2. MONADIC SECOND-ORDER LOGIC ON WORDS 167
Formula for (c):
Respect(X0, . . . , Xn) = ∀x∀y


y = x + 1 →
a ∈ Σ
i, j ∈ {0, . . . , n}
δ(qi, a) = qj
(x ∈ Xi ∧ Qa(x) ∧ y ∈ Xj)


Altogether we get
Visits(X0, . . . Xn) := Init(X0, . . . , Xn) ∧ Partition(X0, . . . , Xn) ∧ Respect(X0, . . . , Xn)
It remains to prove that MSO-deﬁnable languages are regular. Given a sentence ϕ ∈ MSO(Σ)
show that L(ϕ) is regular by induction on the structure of ϕ. However, since the subformulas of a
sentence are not necessarily sentences, the language deﬁned by the subformulas of ϕ is not deﬁned.
We correct this. Recall that the interpretations of a formula are pairs (w, I) where I assigns positions
to the free ﬁrst-order variables and sets of positions to the free second-order variables. For example,
if Σ = {a, b} and if the free ﬁrst-order and second-order variables of the formula are x, y and X, Y,
respectively, then two possible interpretations are


aab ,
x → 1
y → 3
X → {2, 3}
Y → {1, 2}




ba ,
x → 2
y → 1
X → ∅
Y → {1}


Given an interpretation (w, I), we can encode each assignment x → k or X → {k1, . . . , kl} as a
bitstring of the same length as w: the string for x → k contains exactly a 1 at position k, and 0’s
everywhere else; the string for X → {k1, . . . , kl} contains 1’s at positions k1, . . . , kl, and 0’s everywhere
else. After ﬁxing an order on the variables, an interpretation (w, I) can then be encoded as a
tuple (w, w1, . . . , wn), where n is the number of variables, w ∈ Σ∗, and w1, . . . , wn ∈ {0, 1}∗. Since all
of w, w1, . . . , wn have the same length, we can as in the case of transducers look at (w, w1, . . . , wn)
as a word over the alphabet Σ × {0, 1}n. For the two interpretations above we get the encodings
x
y
X
Y
a
1
0
0
1
a
0
0
1
1
b
0
1
1
0
and
x
y
X
Y
b
0
1
0
1
a
1
0
0
0
corresponding to the words
168 CHAPTER 9. AUTOMATA AND LOGIC


a
1
0
0
1




a
0
0
1
1




b
0
1
1
0


and


b
0
1
0
1




a
1
0
0
0


over Σ × {0, 1}4
Deﬁnition 9.13 Let ϕ be a formula with n free variables, and let (w, I) be an interpretation of ϕ.
We denote by enc(w, I) the word over the alphabet Σ × {0, 1}n described above. The language of ϕ
is L(ϕ) = {enc(w, I) | (w, I) |= ϕ}.
Now that we have associated to every formula ϕ a language (whose alphabet depends on the
free variables), we prove by induction on the structure of ϕ that L(ϕ) is regular. We do so by
exhibiting automata (actually, transducers) accepting L(ϕ). For simplicity we assume Σ = {a, b},
and denote by free(ϕ) the set of free variables of ϕ.
• ϕ = Qa(x). Then free(ϕ) = x, and the interpretations of ϕ are encoded as words over Σ×{0, 1}.
The language L(ϕ) is given by
L(ϕ) =



a1
b1
. . .
. . .
ak
bk
k ≥ 0,
ai ∈ Σ and bi ∈ {0, 1} for every i ∈ {1, . . . , k}, and
bi = 1 for exactly one index i ∈ {1, . . . , k} such that ai = a



and is recognized by
a
0
,
b
0
a
0
,
b
0
a
1
• ϕ = x < y. Then free(ϕ) = {x, y}, and the interpretations of φ are encoded as words over
Σ × {0, 1}2. The language L(ϕ) is given by
L(ϕ) =





a1
b1
c1


. . .
. . .


ak
bk
ck


k ≥ 0,
ai ∈ Σ and bi, ci ∈ {0, 1} for every i ∈ {1, . . . , k},
bi = 1 for exactly one index i ∈ {1, . . . , k},
cj = 1 for exactly one index j ∈ {1, . . . , k}, and
i < j



and is recognized by
9.2. MONADIC SECOND-ORDER LOGIC ON WORDS 169


a
0
1




b
0
1




a
0
0




b
0
0




a
0
0


,


b
0
0




a
0
0




b
0
0




a
1
0




b
1
0


• ϕ = x ∈ X. Then free(ϕ) = {x, X}, and interpretations are encoded as words over Σ × {0, 1}2.
The language L(ϕ) is given by
L(ϕ) =





a1
b1
c1


. . .
. . .


ak
bk
ck


k ≥ 0,
ai ∈ Σ and bi, ci ∈ {0, 1} for every i ∈ {1, . . . , k},
bi = 1 for exactly one index i ∈ {1, . . . , k}, and
for every i ∈ {1, . . . , k}, if bi = 1 then ci = 1



and is recognized by


a
0
0


,


b
0
0


,


a
0
1


,


b
0
1




a
0
0


,


b
0
0


,


a
0
1


,


b
0
1




a
1
1


,


b
1
1


• ϕ = ¬ψ. Then free(ϕ) = free(ψ), and by induction hypothesis there exists an automaton Aψ
s.t. L(Aψ) = L(ψ).
Observe that L(ϕ) is not in general equal to L(ψ). To see why, consider for example the case
ψ = Qa(x) and ϕ = ¬Qa(x). The word
a
1
a
1
a
1
belongs neither to L(ψ) nor L(ϕ), because it is not the encoding of any interpretation: the
bitstring for x contains more than one 1. What holds is L(ϕ) = L(ψ) ∩ Enc(ψ), where
Enc(ψ) is the language of the encodings of all the interpretations of ψ (whether they are
models of ψ or not). We construct an automaton Aenc
ψ recognizing Enc(ψ), and so we can
take Aϕ = Aψ ∩ Aenc
ψ .
Assume ψ has k ﬁrst-order variables. Then a word belongs to Enc(ψ) iff each of its projections
onto the 2nd, 3rd, ..., (k + 1)-th component is a bitstring containing exactly one 1. As
states of Aenc
ψ we take all the strings {0, 1}k. The intended meaning of a state, say state 101
for the case k = 3, is “the automaton has already read the 1’s in the bitstrings of the ﬁrst and
third variables, but not yet read the 1 in the second.” The initial and ﬁnal states are 0k and
1k, respectively. The transitions are deﬁned according to the intended meaning of the states.
For instance, the automaton Aenc
x<y is
170 CHAPTER 9. AUTOMATA AND LOGIC


a
1
0


,


b
1
0




a
0
1


,


b
0
1




a
0
1


,


b
0
1




a
1
1


,


b
1
1




a
0
0


,


b
0
0




a
0
0


,


b
0
0




a
1
0


,


b
1
0




a
0
0


,


b
0
0




a
0
0


,


b
0
0


Observe that the number of states of Aenc
ψ grows exponentially in the number of free variables.
This makes the negation operation expensive, even when the automaton Aφ is deterministic.
• ϕ = ϕ1 ∨ ϕ2. Then free(ϕ) = free(ϕ1) ∪ free(ϕ2), and by induction hypothesis there are
automata Aϕi , Aϕ2 such that L(Aϕ1 ) = L(ϕ1) and L(Aϕ2 ) = L(ϕ2).
If free(ϕ1) = free(ϕ2), then we can take Aϕ = Aϕ1 ∪ Aϕ2 . But this need not be the case. If
free(ϕ1) free(ϕ2), then L(ϕ1) and L(ϕ2) are languages over diﬀerent alphabets Σ1, Σ2, or
over the same alphabet, but with diﬀerent intended meaning, and we cannot just compute
their intersection. For example, if ϕ1 = Qa(x) and ϕ2 = Qb(y), then both L(ϕ1) and L(ϕ2)
are languages over Σ × {0, 1}, but the second component indicates in the ﬁrst case the value
of x, in the second the value of y.
This problem is solved by extending L(ϕ1) and L(Aϕ2 ) to languages L1 and L2 over Σ×{0, 1}2.
In our example, the language L1 contains the encodings of all interpretations (w, {x →
n1, y → n2}) such that the projection (w, {x → n1}) belongs to L(Qa(x)), while L2 contains
the interpretations such that (w, {y → n2}) belongs to L(Qb(y)). Now, given the automaton
AQa(x) recognizing L(Qa(x))
a
0
,
b
0
a
1
a
0
,
b
0
we transform it into an automaton A1 recognizing L1
9.2. MONADIC SECOND-ORDER LOGIC ON WORDS 171


a
0
0


,


a
0
1


,


b
0
0


,


b
0
1




a
0
0


,


a
0
1


,


b
0
0


,


b
0
1




a
1
0


,


a
1
1


After constructing A2 similarly, take Aϕ = A1 ∪ A2.
• ϕ = ∃x ψ. Then free(ϕ) = free(ψ) \ {x}, and by induction hypothesis there is an automaton
Aψ s.t. L(Aψ) = L(ψ). Deﬁne A∃ xψ as the result of the projection operation, where we
project onto all variables but x. The operation simply corresponds to removing in each letter
of each transition of Aσ the component for variable x. For example, the automaton A∃x Qa(x)
is obtained by removing the second components in the automaton for AQa(x) shown above,
yielding
a, b a, b
a
Observe that the automaton for ∃x ψ can be nondeterministic even if the one for ψ is deterministic,
since the projection operation may map diﬀerent letters into the same one.
• ϕ = ∃X ϕ. We proceed as in the previous case.
Size of Aϕ. The procedure for constructing Aϕ proceeds bottom-up on the syntax tree of ϕ. We
ﬁrst construct automata for the atomic formulas in the leaves of the tree, and then proceed upwards:
given automata for the children of a node in the tree, we construct an automaton for the node itself.
Whenever a node is labeled by a negation, the automaton for it can be exponentially bigger
than the automaton for its only child. This yields an upper bound for the size of Aϕ equal to a tower
of exponentials, where the height of the tower is equal to the largest number of negations in any
path from the root of the tree to one of its leaves.
It can be shown that this very large upper bound is essentially tight: there are formulas for
which the smallest automaton recognizing the same language as the formula reaches the upper
bound. This means that MSO-logic allows to describe some regular languages in an extremely
succinct form.
Example 9.14 Consider the alphabet Σ = {a, b} and the language a∗b ⊆ Σ∗, recognized by the
NFA
a
b
172 CHAPTER 9. AUTOMATA AND LOGIC
We derive this NFA by giving a formula ϕ such that L(ϕ) = a∗b, and then using the procedure
described above. We shall see that the procedure is quite laborious. The formula states that the last
letter is b, and all other letters are a’s.
ϕ = ∃x (last(x) ∧ Qb(x)) ∧ ∀x (¬last(x) → Qa(x))
We ﬁrst bring ϕ into the equivalent form
ψ = ∃x (last(x) ∧ Qb(x)) ∧ ¬∃x (¬last(x) ∧ ¬Qa(x))
We transform ψ into an NFA. First, we compute an automaton for last(x) = ¬∃y x < y. Recall
that the automaton for x < y is


a
0
1




b
0
1




a
0
0




b
0
0




a
0
0


,


b
0
0




a
0
0




b
0
0




a
1
0




b
1
0


[x < y]
Applying the projection operation, we get following automaton for ∃y x < y
a
0
,
b
0
a
0
,
b
0
a
0
,
b
0
a
1
,
b
1
a
0
,
b
0
[∃y x < y]
Recall that computing the automaton for the negation of a formula requires more than complementing
the automaton. First,we need an automaton recognizing Enc(∃y x < y).
a
1
,
b
1
a
0
,
b
0
a
0
,
b
0
Second, we determinize and complement the automaton for ∃y x < y:
a
0
,
b
0
Σ × {0, 1}
a
1
,
b
1
Σ × {0, 1}
9.2. MONADIC SECOND-ORDER LOGIC ON WORDS 173
And ﬁnally, we compute the intersection of the last two automata, getting
a
0
,
b
0
a
1
,
b
1
a
0
,
b
0
a
0
,
b
0
whose last state is useless and can be removed, yielding the following NFA for last(x):
a
0
,
b
0
a
1
,
b
1
[last(x)]
Next we compute an automaton for ∃x (last(x) ∧ Qb(x)) , the ﬁrst conjunct of ψ. We start with an
NFA for Qb(x)
b
1
a
0
,
b
0
a
0
,
b
0
[Qb(x)]
The automaton for ∃x (last(x) ∧ Qb(x)) is the result of intersecting this automaton with the NFA
for last(x) and projecting onto the ﬁrst component. We get
a, b
b
[∃x (last(x) ∧ Qb(x))]
Now we compute an automaton for ¬∃x (¬last(x) ∧ ¬Qa(x)), the second conjunct of ψ. We ﬁrst
obtain an automaton for ¬Qa(x) by intersecting the complement of the automaton for Qa(x) and
the automaton for Enc(Qa(x). The automaton for Qa(x) is
a
1
a
0
,
b
0
a
0
,
b
0
[Qa(x)]
174 CHAPTER 9. AUTOMATA AND LOGIC
and after determinization and complementation we get
a
1
b
1
a
1
b
1
Σ × {0, 1}
a
0
b
0
a
0
b
0
For the automaton recognizing Enc(Qa(x)), notice that Enc(Qa(x)) = Enc(∃y x < y), because both
formulas have the same free variables, and so the same interpretations. But we have already computed
an automaton for Enc(∃y x < y), namely
a
1
,
b
1
a
0
,
b
0
a
0
,
b
0
The intersection of the last two automata yields a three-state automaton for ¬Qa(x), but after eliminating
a useless state we get
b
1
a
0
,
b
0
a
0
,
b
0
[¬Qa(x)]
Notice that this is the same automaton we obtained for Qb(x), which is ﬁne, because over the
alphabet {a, b} the formulas Qb(x) and ¬Qa(x) are equivalent.
To compute an automaton for ¬last(x) we just observe that ¬last(x) is equivalent to ∃y x < y,
for which we have already compute an NFA, namely
a
0
,
b
0
a
0
,
b
0
a
0
,
b
0
a
1
,
b
1
a
0
,
b
0
[¬last(x)]
9.2. MONADIC SECOND-ORDER LOGIC ON WORDS 175
Intersecting the automata for ¬last(x) and ¬Qa(x), and subsequently projecting onto the ﬁrst component,
we get an automaton for ∃x (¬last(x) ∧ ¬Qa(x))
a, b
b
a, b a, b
a, b
[∃x (¬last(x) ∧ ¬Qa(x))]
Determinizing, complementing, and removing a useless state yields the following NFA for ¬∃x (¬last(x)∧
¬Qa(x)):
a
b
[¬∃x (¬last(x) ∧ ¬Qa(x))]
Summarizing, the automata for the two conjuncts of ψ are
a, b
b
and
a
b
whose intersection yields a 3-state automaton, which after removal of a useless state becomes
a
b
[∃x (last(x) ∧ Qb(x)) ∧ ¬∃x (¬last(x) ∧ ¬Qa(x))]
ending the derivation.
Exercises
Exercise 76 Characterize the languages described by the following formulas and give a corresponding
automaton:
1. ∃x ﬁrst(x)
2. ∀x ﬁrst(x)
3. ¬∃x∃y (x < y∧ Qa(x)∧ Qb(y)) ∧ ∀x (Qb(x) → ∃y x < y∧ Qa(y)) ∧ ∃x ¬∃y (x < y∧ Qa(x))
176 CHAPTER 9. AUTOMATA AND LOGIC
Exercise 77 Give a deﬁning MSO-formula, an automaton, and a regular expression for the following
languages over {a, b}.
• The set of words of even length and containing only a’s or only b’s.
• The set of words, where between each two b’s with no other b in between there is a block of
an odd number of letters a.
• The set of words with odd length and an odd number of occurrences of a.
Exercise 78 For every n ≥ 1, give a FO-formula of polynomial length in n abbreviating y = x+2n.
(Notice that the abbreviation y = x + k of page 9.1 has length O(k), and so it cannot be directly
used.) Use it to give another FO-formula ϕn, also of of polynomial length in n, for the language
Ln = {ww ∈ {a, b}∗ | |w| = 2n}.
Remark: Since the minimal DFA for Ln has 22n
states (Exercise 10), this shows that the number of
states of a minimal automaton equivalent to a given FO-formula may be double exponential in the
length of the formula.
Exercise 79 MSO over a unary alphabet can be used to automatically prove some simple properties
of the natural numbers. Consider for instance the following property: every ﬁnite set of natural
numbers has a minimal element2. It is easy to see that this property holds iff the formula
∀Z∃x∀y (y ∈ Z → (x ≤ y ∧ x ∈ Z))
is a tautology, i.e., if it is satisﬁed by every word. Construct an automaton for the formula, and
check that it is universal.
Exercise 80 Give formulas ϕ1, . . . , ϕ4 for the following abbreviations:
Sing(X) := ϕ1 “X is a singleton, i.e., X contains one element”
X ⊆ Y := ϕ2 “X is a subset of Y”
X ⊆ Qa := ϕ3 “every position of X contains an a”
X < Y := ϕ4 “X and Y are singletons X = {x} and Y = {y} satisfying x < y”
Exercise 81 Express addition in MSO({a}. More precisely, ﬁnd a formula +(X, Y, Z) of MSO({a})
that is true iﬀ x + y = z, where x, y, z are the numbers encoded by the sets X, Y, Z, respectively, in
lsbf encoding. You may use any abbrevation deﬁned in the chapter.
2
Of course, this also holds for every inﬁnite set, but we cannot prove it using MSO over ﬁnite words.
9.2. MONADIC SECOND-ORDER LOGIC ON WORDS 177
Exercise 82 The nesting depth d(ϕ) of a formula ϕ of FO({a}) is deﬁned inductively as follows:
• d(Qa(x)) = d(x < y) = 0;
• d(¬ϕ) = d(ϕ), d(ϕ1 ∨ ϕ2) = max{d(ϕ1), d(ϕ2)}; and
• d(∃x ϕ) = 1 + d(ϕ).
Prove that every formula ϕ of FO({a}) of nesting depth n is equivalent to a formula f of QF having
the same free variables as ϕ, and such that every constant k appearing in f satisﬁes k ≥ 2n.
Hint: The proof is similar to that of Theorem 9.8. The diﬃcult case is the one where ϕ has the
form ∃x ψ and ψ is a conjunction. Deﬁne f as the following conjunction. All conjuncts of ψ not
containing x are also conjuncts of f; for every conjunct of D of the form x ≥ k or x ≥ y + k, f
contains a conjunct last ≥ k; for every two conjuncts of D containing x, f contains a conjunct
obtained by “quantifying x away”: for example, if the conjuncts are x ≥ k1 and y ≥ x + k2, then
f has the conjunct y ≥ k1 + k2. Since the constants in the new conjuncts are the sum of two old
constants, the new constants are bounded by 2 · 2d = 2d+1.