Leonid Libkin
Elements of Finite Model Theory
With 24 Figures
February 7, 2012
Springer
Berlin Heidelberg NewYork
Hong Kong London
Milan Paris Tokyo
To Helen, Daniel, and Victoria
Preface
Finite model theory is an area of mathematical logic that grew out of computer
science applications.
The main sources of motivational examples for ﬁnite model theory are
found in database theory, computational complexity, and formal languages,
although in recent years connections with other areas, such as formal methods
and veriﬁcation, and artiﬁcial intelligence, have been discovered.
The birth of ﬁnite model theory is often identiﬁed with Trakhtenbrot’s
result from 1950 stating that validity over ﬁnite models is not recursively
enumerable; in other words, completeness fails over ﬁnite models. The technique
of the proof, based on encoding Turing machine computations as ﬁnite
structures, was reused by Fagin almost a quarter century later to prove his celebrated
result that put the equality sign between the class NP and existential
second-order logic, thereby providing a machine-independent characterization
of an important complexity class. In 1982, Immerman and Vardi showed that
over ordered structures, a ﬁxed point extension of ﬁrst-order logic captures
the complexity class Ptime of polynomial time computable properties. Shortly
thereafter, logical characterizations of other important complexity classes were
obtained. This line of work is often referred to as descriptive complexity.
A diﬀerent line of ﬁnite model theory research is associated with the development
of relational databases. By the late 1970s, the relational database
model had replaced others, and all the basic query languages for it were essentially
ﬁrst-order predicate calculus or its minor extensions. In 1974, Fagin
showed that ﬁrst-order logic cannot express the transitive closure query over
ﬁnite relations. In 1979, Aho and Ullman rediscovered this result and brought
it to the attention of the computer science community. Following this, Chandra
and Harel proposed a ﬁxed-point extension of ﬁrst-order logic on ﬁnite
relational structures as a query language capable of expressing queries such
as the transitive closure. Logics over ﬁnite models have become the standard
starting point for developing database query languages, and ﬁnite model theory
techniques are used for proving results about their expressiveness and
complexity.
VIII Preface
Yet another line of work on logics over ﬁnite models originated with B¨uchi’s
work from the early 1960s: he showed that regular languages are precisely
those deﬁnable in monadic second-order logic over strings. This line of work
is the automata-theoretic counterpart of descriptive complexity: instead of
logical characterizations of time/space restrictions of Turing machines, one
provides such characterizations for weaker devices, such as automata. More
recently, connections between database query languages and automata have
been explored too, as the ﬁeld of databases started moving away from relations
to more complex data models.
In general, ﬁnite model theory studies the behavior of logics on ﬁnite structures.
The reason this is a separate subject, and not a tiny chapter in classical
model theory, is that most standard model-theoretic tools (most notably, compactness)
fail over ﬁnite models. Over the past 25–30 years, many tools have
been developed to study logics over ﬁnite structures, and these tools helped
answer many questions about complexity theory, databases, formal languages,
etc.
This book is an introduction to ﬁnite model theory, geared towards theoretical
computer scientists. It grew out of my ﬁnite model theory course,
taught to computer science graduate students at the University of Toronto.
While teaching that course, I realized that there is no single source that covers
all the main areas of ﬁnite model theory, and yet is suitable for computer
science students. There are a number of excellent books on the subject. Finite
Model Theory by Ebbinghaus and Flum was the ﬁrst standard reference and
heavily inﬂuenced the development of the ﬁeld, but it is a book written for
mathematicians, not computer scientists. There is also a nice set of notes by
V¨a¨an¨anen, available on the web. Immerman’s Descriptive Complexity deals
extensively with complexity-theoretic aspects of ﬁnite model theory, but does
not address other applications. Foundations of Databases by Abiteboul, Hull,
and Vianu covers many database applications, and Thomas’s chapter “Languages,
automata, and logic” in the Handbook of Formal Languages describes
connections between logic and formal languages. Given the absence of a single
source for all the subjects, I decided to write course notes, which eventually
became this book.
The reader is assumed to have only the most basic computer science and
logic background: some discrete mathematics, theory of computation, complexity,
propositional and predicate logic. The book also includes a background
chapter, covering logic, computability theory, and computational complexity.
In general, the book should be accessible to senior undergraduate students in
computer science.
A note on exercises: there are three kinds of these. Some are the usual
exercises that the reader should be able to do easily after reading each chapter.
If I indicate that an exercise comes from a paper, it means that its level could
range from moderately to extremely diﬃcult: depending on the exact level,
such an “exercise” could be a question on a take-home exam, or even a course
Preface IX
project, whose main goal is to understand the paper where the result is proven.
Such exercises also gave me the opportunity to mention a number of interesting
results that otherwise could not have been included in the book. There are
also exercises marked with an asterisk: for these, I do not know solutions.
It gives me the great pleasure to thank my colleagues and students for
their help. I received many comments from Marcelo Arenas, Pablo Barcel´o,
Michael Benedikt, Ari Brodsky, Anuj Dawar, Ron Fagin, Arthur Fischer, Lauri
Hella, Christoph Koch, Janos Makowsky, Frank Neven, Juha Nurmonen, Ben
Rossman, Luc Segouﬁn, Thomas Schwentick, Jan Van den Bussche, Victor
Vianu, and Igor Walukiewicz. Ron Fagin, as well as Yuri Gurevich, Alexander
Livchak, Michael Taitslin, and Vladimir Sazonov, were also very helpful with
historical comments. I taught two courses based on this book, and students
in both classes provided very useful feedback; in addition to those I already
thanked, I would like to acknowledge Antonina Kolokolova, Shiva Nejati, Ken
Pu, Joseph Rideout, Mehrdad Sabetzadeh, Ramona Truta, and Zheng Zhang.
Despite their great eﬀort, mistakes undoubtedly remain in the book; if you
ﬁnd one, please let me know. My email is libkin@cs.toronto.edu.
Many people in the ﬁnite model theory community inﬂuenced my view
of the ﬁeld; it is impossible to thank them all, but I want to mention Scott
Weinstein, from whom I learned ﬁnite model theory, and immediately became
fascinated with the subject.
Finally, I thank Ingeborg Mayer, Alfred Hofmann, and Frank Holzwarth
at Springer-Verlag for editorial assistance, and Denis Th´erien for providing
ideal conditions for the ﬁnal proofreading of the book.
This book is dedicated to my wife, Helen, and my son, Daniel. Daniel
was born one week after I ﬁnished teaching a ﬁnite model theory course in
Toronto, and after several sleepless nights I decided that perhaps writing a
book is the type of activity that goes well with the lack of sleep. By the time
I was writing Chap. 6, Daniel had started sleeping through the night, but at
that point it was too late to turn back. And without Helen’s help and support
I certainly would not have ﬁnished this book in only two years.
Toronto, Ontario, Canada
May 2004 Leonid Libkin
Contents
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 A Database Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 An Example from Complexity Theory . . . . . . . . . . . . . . . . . . . . . . 4
1.3 An Example from Formal Language Theory. . . . . . . . . . . . . . . . . 6
1.4 An Overview of the Book . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.1 Background from Mathematical Logic . . . . . . . . . . . . . . . . . . . . . . 13
2.2 Background from Automata and Computability Theory . . . . . . 17
2.3 Background from Complexity Theory . . . . . . . . . . . . . . . . . . . . . . 19
2.4 Bibliographic Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
3 Ehrenfeucht-Fra¨ıss´e Games . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
3.1 First Inexpressibility Proofs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
3.2 Deﬁnition and Examples of Ehrenfeucht-Fra¨ıss´e Games . . . . . . . 26
3.3 Games and the Expressive Power of FO . . . . . . . . . . . . . . . . . . . . 32
3.4 Rank-k Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
3.5 Proof of the Ehrenfeucht-Fra¨ıss´e Theorem . . . . . . . . . . . . . . . . . . 35
3.6 More Inexpressibility Results. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
3.7 Bibliographic Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
4 Locality and Winning Games . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
4.1 Neighborhoods, Hanf-locality, and Gaifman-locality . . . . . . . . . . 45
4.2 Combinatorics of Neighborhoods . . . . . . . . . . . . . . . . . . . . . . . . . . 49
4.3 Locality of FO . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
4.4 Structures of Small Degree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
4.5 Locality of FO Revisited . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
4.6 Bibliographic Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
4.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
XII Contents
5 Ordered Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
5.1 Invariant Queries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
5.2 The Power of Order-invariant FO . . . . . . . . . . . . . . . . . . . . . . . . . . 69
5.3 Locality of Order-invariant FO . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
5.4 Bibliographic Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
5.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
6 Complexity of First-Order Logic . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
6.1 Data, Expression, and Combined Complexity . . . . . . . . . . . . . . . 87
6.2 Circuits and FO Queries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
6.3 Expressive Power with Arbitrary Predicates. . . . . . . . . . . . . . . . . 93
6.4 Uniformity and AC0
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
6.5 Combined Complexity of FO . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
6.6 Parametric Complexity and Locality . . . . . . . . . . . . . . . . . . . . . . . 99
6.7 Conjunctive Queries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
6.8 Bibliographic Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
6.9 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
7 Monadic Second-Order Logic and Automata . . . . . . . . . . . . . . . 113
7.1 Second-Order Logic and Its Fragments . . . . . . . . . . . . . . . . . . . . . 113
7.2 MSO Games and Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
7.3 Existential and Universal MSO on Graphs . . . . . . . . . . . . . . . . . . 119
7.4 MSO on Strings and Regular Languages . . . . . . . . . . . . . . . . . . . . 124
7.5 FO on Strings and Star-Free Languages . . . . . . . . . . . . . . . . . . . . 127
7.6 Tree Automata . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
7.7 Complexity of MSO . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
7.8 Bibliographic Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
7.9 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
8 Logics with Counting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
8.1 Counting and Unary Quantiﬁers . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
8.2 An Inﬁnitary Counting Logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
8.3 Games for L∗
∞ω(Cnt) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
8.4 Counting and Locality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
8.5 Complexity of Counting Quantiﬁers . . . . . . . . . . . . . . . . . . . . . . . . 155
8.6 Aggregate Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
8.7 Bibliographic Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
8.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
9 Turing Machines and Finite Models . . . . . . . . . . . . . . . . . . . . . . . 165
9.1 Trakhtenbrot’s Theorem and Failure of Completeness . . . . . . . . 165
9.2 Fagin’s Theorem and NP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
9.3 Bibliographic Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
9.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
Contents XIII
10 Fixed Point Logics and Complexity Classes . . . . . . . . . . . . . . . . 177
10.1 Fixed Points of Operators on Sets . . . . . . . . . . . . . . . . . . . . . . . . . 178
10.2 Fixed Point Logics. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180
10.3 Properties of LFP and IFP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184
10.4 LFP, PFP, and Polynomial Time and Space . . . . . . . . . . . . . . . . 192
10.5 Datalog and LFP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195
10.6 Transitive Closure Logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
10.7 A Logic for Ptime? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204
10.8 Bibliographic Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206
10.9 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
11 Finite Variable Logics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211
11.1 Logics with Finitely Many Variables . . . . . . . . . . . . . . . . . . . . . . . 211
11.2 Pebble Games. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
11.3 Deﬁnability of Types. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220
11.4 Ordering of Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225
11.5 Canonical Structures and the Abiteboul-Vianu Theorem . . . . . . 229
11.6 Bibliographic Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232
11.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233
12 Zero-One Laws . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
12.1 Asymptotic Probabilities and Zero-One Laws . . . . . . . . . . . . . . . 235
12.2 Extension Axioms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 238
12.3 The Random Graph . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241
12.4 Zero-One Law and Second-Order Logic . . . . . . . . . . . . . . . . . . . . . 243
12.5 Almost Everywhere Equivalence of Logics . . . . . . . . . . . . . . . . . . 245
12.6 Bibliographic Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 246
12.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247
13 Embedded Finite Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 249
13.1 Embedded Finite Models: the Setting . . . . . . . . . . . . . . . . . . . . . . 249
13.2 Analyzing Embedded Finite Models. . . . . . . . . . . . . . . . . . . . . . . . 252
13.3 Active-Generic Collapse . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 256
13.4 Restricted Quantiﬁer Collapse. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 260
13.5 The Random Graph and Collapse to MSO . . . . . . . . . . . . . . . . . . 265
13.6 An Application: Constraint Databases . . . . . . . . . . . . . . . . . . . . . 267
13.7 Bibliographic Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 270
13.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271
14 Other Applications of Finite Model Theory . . . . . . . . . . . . . . . . 275
14.1 Finite Model Property and Decision Problems. . . . . . . . . . . . . . . 275
14.2 Temporal and Modal Logics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 278
14.3 Constraint Satisfaction and Homomorphisms of Finite Models . 285
14.4 Bibliographic Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 288
XIV Contents
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 291
1
Introduction
Finite model theory studies the expressive power of logics on ﬁnite models.
Classical model theory, on the other hand, concentrates on inﬁnite structures:
its origins are in mathematics, and most objects of interest in mathematics
are inﬁnite, e.g., the sets of natural numbers, real numbers, etc. Typical examples
of interest to a model-theorist would be algebraically closed ﬁelds (e.g.,
C, +, · ), real closed ﬁelds (e.g., R, +, ·, < ), various models of arithmetic
(e.g., N, +, · or N, + ), and other structures such as Boolean algebras or
random graphs.
The origins of ﬁnite model theory are in computer science where most objects
of interest are ﬁnite. One is interested in the expressiveness of logics over
ﬁnite graphs, or ﬁnite strings, other ﬁnite relational structures, and sometimes
restrictions of arithmetic structures to an initial segment of natural numbers.
The areas of computer science that served as a primary source of examples,
as well as the main consumers of techniques from ﬁnite model theory, are
databases, complexity theory, and formal languages (although ﬁnite model
theory found applications in other areas such as AI and veriﬁcation). In this
chapter, we give three examples that illustrate the need for studying logics
over ﬁnite structures.
1.1 A Database Example
While early database systems used rather ad hoc data models, from the early
1970s the world switched to the relational model. In that model, a database
stores tables, or relations, and is queried by a logic-based declarative language.
The most standard such language, relational calculus, has precisely
the power of ﬁrst-order predicate calculus. In real life, it comes equipped with
a specialized programming syntax (e.g., the select-from-where statement of
SQL).
Suppose that we have a company database, and one of its relations is the
Reports To relation: it stores pairs (x, y), where x is an employee, and y is
2 1 Introduction
his/her immediate manager. Organizational hierarchies tend to be quite complicated
and often result in many layers of management, so one may want to
skip the immediate manager level and instead look for the manager’s manager.
In SQL, this would be done by the following query:
select R1.employee, R2.manager
from Reports_To R1, Reports_To R2
where R1.manager=R2.employee
This is simply a diﬀerent way of writing the following ﬁrst-order logic
formula:
ϕ(x, y) ≡ ∃z Reports To(x, z) ∧ Reports To(z, y) .
Continuing, we may ask for someone’s manager’s manager’s manager:
∃z1∃z2 Reports To(x, z1) ∧ Reports To(z1, z2) ∧ Reports To(z2, y) ,
and so on.
But what if we want to ﬁnd everyone who is higher in the hierarchy than
a given employee? Speaking graph-theoretically, if we associate a pair (x, y)
in the Reports To relation with a directed edge from x to y in a graph, then
we want to ﬁnd, for a given node, all the nodes reachable from it. This does
not seem possible in ﬁrst-order logic, but how can one prove this?
There are other queries naturally related to this reachability property.
Suppose that once in a while, the company wants to make sure that its management
hierarchy is logically consistent; that is, we cannot have cycles in the
Reports To relation. In graph-theoretic terms, it means that Reports To is
acyclic. Again, if one thinks about it for a while, it seems that ﬁrst-order logic
does not have enough power to express this query.
We now consider a diﬀerent kind of query. Suppose we have two managers,
x and y, and let X be the set of all the employees directly managed by x (i.e.,
all x′
such that (x′
, x) is in Reports To), and likewise let Y be the set of all
the employees directly managed by y. Can we write a query asking whether
|X | = |Y |; that is, a query asking whether x and y have the same number of
people reporting to them?
It turns out that ﬁrst-order logic is again not suﬃciently expressive for
this kind of query, but since queries like those described above are so common
in practice, SQL adds special features to the language to perform them. That
is, SQL can count: it can apply the cardinality function (and more complex
functions as well) to entire columns in relations. For example, in SQL one can
write a query that ﬁnds all pairs of managers x and y who have the same
number of people reporting to them:
1.1 A Database Example 3
select R1.manager, R2.manager
from Reports_To R1, Reports_To R2
where (select count(Reports_To.employee)
from Reports_To
where Reports_To.manager = R1.manager)
= (select count(Reports_To.employee)
from Reports_To
where Reports_To.manager = R2.manager)
Since this cannot be done in ﬁrst-order logic, but can be done in SQL (and,
in fact, in some rather simple extensions of ﬁrst-order logic with counting), it
is natural to ask whether counting provides enough expressiveness to deﬁne
queries such as reachability (can node x be reached from node y in a given
graph?) and acyclicity.
Typical applications of ﬁnite model theory in databases have to deal with
questions of this sort: what can, and, more importantly, what cannot, be
expressed in various query languages.
Let us now give intuitive reasons why reachability queries are not expressible
in ﬁrst-order logic. Consider a diﬀerent example. Suppose that we have
an airline database, with a binary relation R (for routes), such that an entry
(A, B) in R indicates that there is a ﬂight from A to B. Now suppose we want
to ﬁnd all pairs of cities A, B such that there is a direct ﬂight between them;
this is done by the following query:
q0(x, y) ≡ R(x, y),
which is simply a ﬁrst-order formula with two free variables. Next, suppose
we want to know if one can get from x to y with exactly one change of plane;
then we write
q1(x, y) ≡ ∃z R(x, z) ∧ R(z, y).
Doing “with at most one change” means having a disjunction
Q1(x, y) ≡ q1(x, y) ∨ q0(x, y).
Clearly, for each ﬁxed k we can write a formula stating that one can get from
x to y with exactly k stops:
qk(x, y) ≡ ∃z1 . . . ∃zk R(x, z1) ∧ R(z1, z2) ∧ . . . ∧ R(zk, y),
as well as Qk = j≤k qj testing if at most k stops suﬃce.
But what about the reachability query: can we get from x to y? That is,
one wants to compute the transitive closure of R. The problem with this is
that we do not know in advance what k is supposed to be. So the query that
we need to write is
k∈N
qk,
4 1 Introduction
but this is not a ﬁrst-order formula! Of course this is not a formal proof that
reachability is not expressible in ﬁrst-order logic (we shall see a proof of this
fact in Chap. 3), but at least it gives a hint as to what the limitations of
ﬁrst-order logic are.
The inability of ﬁrst-order logic to express some important queries motivated
a lot of research on extensions of ﬁrst-order logic that can do queries
such as transitive closure or cardinality comparisons. We shall see a number
of extensions of these kinds – ﬁxed point logics, (fragments of) second-order
logic, counting logics – that are important for database theory, and we shall
study properties of these extensions as well.
1.2 An Example from Complexity Theory
We now turn to a diﬀerent area, and to more expressive logics. Suppose that
we have a graph, this time undirected, given to us as a pair V, E , where V
is the set of vertices, or nodes, and E is the edge relation. Assume that now
we can specify graph properties in second-order logic; that is, we can quantify
over sets (or relations) of nodes.
Consider a well-known property of Hamiltonicity. A simple circuit in a
graph G is a sequence (a1, . . . , an) of distinct nodes such that there are
edges (a1, a2), (a2, a3), . . . , (an−1, an), (an, a1). A simple circuit is Hamiltonian
if V = {a1, . . . , an}. A graph is Hamiltonian if it has a Hamiltonian circuit.
We now consider the following formula:
∃L ∃S




linear order(L)
∧ S is the successor relation of L
∧ ∀x∃y (L(x, y) ∨ L(y, x))
∧ ∀x∀y (S(x, y) → E(x, y))



 (1.1)
The quantiﬁers ∃L ∃S state the existence of two binary relations, L and S,
that satisfy the formula in parentheses. That formula uses some abbreviations.
The subformula linear order(L) in (1.1) states that the relation L is a linear
ordering; it can be deﬁned as
∀x¬L(x, x) ∧ ∀x∀y∀z (L(x, y) ∧ L(y, z) → L(x, z))
∧ ∀x∀y (x = y) → L(x, y) ∨ L(y, x) .
The subformula S is the successor relation of L states that S is the successor
relation associated with the linear ordering L; it can be deﬁned as
∀x∀y S(x, y) ↔
L(x, y) ∧ ¬∃z L(x, z) ∧ L(z, y)
∨ ¬∃z L(x, z) ∧ ¬∃z L(z, y)
1.2 An Example from Complexity Theory 5
Note that S is the circular successor relation, as it also includes the pair (x, y)
where x is the maximal and y the minimal element with respect to L.
Then (1.1) says that L and S are deﬁned on all nodes of the graph, and
that S is a subset of E. Hence, S is a Hamiltonian circuit, and thus (1.1) tests
if a graph is Hamiltonian.
It it well known that testing Hamiltonicity is an NP-complete problem. Is
this a coincidence, or is there a natural connection between NP and secondorder
logic? Let us turn our attention to two other well-known NP-complete
problems: 3-colorability and clique.
To test if a graph is 3-colorable, we have to check that there exist three
disjoint sets A, B, C covering the nodes of the graph such that for every edge
(a, b) ∈ E, the nodes a and b cannot belong to the same set. The sentence
below does precisely that:
∃A∃B∃C










∀x


(A(x) ∧ ¬B(x) ∧ ¬C(x))
∨ (¬A(x) ∧ B(x) ∧ ¬C(x))
∨ (¬A(x) ∧ ¬B(x) ∧ C(x))


∧
∀x, y E(x, y) → ¬


(A(x) ∧ A(y))
∨ (B(x) ∧ B(y))
∨ (C(x) ∧ C(y))












(1.2)
For clique, typically one has a parameter k, and the problem is to check
whether a clique of size k exists. Here, to stay purely within the formalism of
second-order logic, we assume that the input is a graph E and a set of nodes
(a unary relation) U, and we ask if E has a clique of size | U |. We do it by
testing if there is a set C (nodes of the clique) and a binary relation F that is
a one-to-one correspondence between C and U. Testing that the restriction of
E to C is a clique, and that F is one-to-one, can be done in ﬁrst-order logic.
Thus, the test is done by the following second-order sentence:
∃C∃F




∀x∀y F(x, y) → (C(x) ∧ U(y))
∧ ∀x C(x) → ∃!y(F(x, y) ∧ U(y))
∧ ∀y U(y) → ∃!x(F(x, y) ∧ C(x))
∧ ∀x∀y C(x) ∧ C(y) → E(x, y)



 (1.3)
Here ∃!xϕ(x) means “there exists exactly one x such that ϕ(x)”; this is
an abbreviation for ∃x ϕ(x) ∧ ∀y (ϕ(y) → x = y) .
Notice that (1.1), (1.2), and (1.3) all follow the same pattern: they start
with existential second-order quantiﬁers, followed by a ﬁrst-order formula.
Such formulas form what is called existential second-order logic, abbreviated
as ∃SO. The connection to NP can easily be seen: existential second-order
quantiﬁers correspond to the guessing stage of an NP algorithm, and the
remaining ﬁrst-order formula corresponds to the polynomial time veriﬁcation
stage of an NP algorithm.
6 1 Introduction
It turns out that the connection between NP and ∃SO is exact, as was
shown by Fagin in his celebrated 1974 theorem, stating that NP = ∃SO.
This connection opened up a new area, called descriptive complexity. The
goals of descriptive complexity are to describe complexity classes by means
of logical formalisms, and then use tools from mathematical logic to analyze
those classes. We shall prove Fagin’s theorem later, and we shall also see logical
characterizations of a number of other familiar complexity classes.
1.3 An Example from Formal Language Theory
Now we turn our attention to strings over a ﬁnite alphabet, say Σ = {a, b}.
We want to represent a string as a structure, much like a graph.
Given a string s = s1s2 . . . sn, we create a structure Ms as follows: the
universe is {1, . . . , n} (corresponding to positions in the string), we have one
binary relation < whose meaning of course is the usual order on the natural
numbers, and two unary relations A and B. Then A(i) is true if si = a, and
B(i) is true if si = b. For example, Mabba has universe {1, 2, 3, 4}, with A
interpreted as {1, 4} and B as {2, 3}.
Let us look at the following second-order sentence in which quantiﬁers
range over sets of positions in a string:
Φ ≡ ∃X∃Y



∀x X(x) ↔ ¬Y (x)
∧ ∀x ∀y (X(x) ∧ Y (y) → x < y)
∧ ∀x (X(x) → A(x) ∧ Y (x) → B(x))



When is Ms a model of Φ? This happens iﬀ there exists two sets of positions,
X and Y , such that X and Y form a partition of the universe (this is
what the ﬁrst conjunct says), that all positions in X precede the positions in
Y (that is what the second conjunct says), and that for each position i in X,
the ith symbol of s is a, for each position j in Y , the jth symbol is b (this
is stated in the third conjunct). That is, the string starts with some a’s, and
then switches to all b’s. Using the language of regular expressions, we can say
that
Ms |= Φ iﬀ s ∈ a∗
b∗
.
Is quantiﬁcation over sets really necessary in this example? It turns out
that the answer is no: one can express the fact that s is in a∗
b∗
by saying that
there are no two positions i < j such that the ith symbol is b and the jth
symbol is a. This, of course, can be done in ﬁrst-order logic:
¬∃i∃j (i < j) ∧ B(i) ∧ A(j) .
A natural question that arises then is the following: are second-order quantiﬁers
of no use if one wants to describe regular languages by logical means?
The answer is no, as we shall see later. For now, we can give an example.
1.3 An Example from Formal Language Theory 7
First, consider the sentence Φa ≡ ∀i A(i), which is true in Ms iﬀ s ∈ a∗
.
Next, deﬁne a relation i ≺ j saying that j is the successor of i. It can be
deﬁned by the formula (i < j) ∧ ∀k ((k ≤ i) ∨ (k ≥ j)) . Now consider the
sentence
Φ1 ≡ ∃X∃Y






∀i X(i) ↔ ¬Y (i)
∧ ∀i ¬∃j(j < i) → X(i)
∧ ∀i ¬∃j(j > i) → Y (i)
∧ ∀i∀j (i ≺ j) ∧ X(i) → Y (j)
∧ ∀i∀j (i ≺ j) ∧ Y (i) → X(j)






This sentence says that the universe {1, . . ., n} can be partitioned into two
sets X and Y such that 1 ∈ X, n ∈ Y , and the successor of an element of X
is in Y and vice versa; that is, the size of the universe is even.
Now what is Φ1 ∧ Φa? It says that the string is of even length, and has
only a’s in it – hence, Ms |= Φ1 ∧Φa iﬀ s ∈ (aa)∗
. It turns out that one cannot
deﬁne (aa)∗
using ﬁrst-order logic alone: one needs second-order quantiﬁers.
Moreover, with second-order quantiﬁers ranging over sets of positions, one
deﬁnes precisely the regular languages. We shall deal with both expressibility
and inexpressibility results related to logics over strings later in this book.
There are a number of common themes in the examples presented above.
In all the cases, we are talking about the expressive power of logics over ﬁnite
objects: relational databases, graphs, and strings. There is a close connection
between logical formalisms and familiar concepts from computer science: ﬁrstorder
logic corresponds to relational calculus, existential second-order logic to
the complexity class NP, and second-order logic with quantiﬁers ranging over
sets describes regular languages.
Of equal importance is the fact that in all the examples we want to show
some inexpressibility results. In the database example, we want to show that
the transitive closure is not expressible in ﬁrst-order logic. In the complexity
example, it would be nice to show that certain problems cannot be expressed
in ∃SO – any such result would give us bounds on the class NP, and this would
hopefully lead to separation results for complexity classes. In the example from
formal languages, we want to show that certain regular languages (e.g., (aa)∗
)
cannot be expressed in ﬁrst-order logic.
Inexpressibility results have traditionally been a core theme of ﬁnite model
theory. The main explanation for that is the source of motivating examples
for ﬁnite model theory. Most of them come from computer science, where one
is dealing not with natural phenomena, but rather with artiﬁcial creations.
Thus, we often want to know the limitations of these creations. In general,
this explains the popularity of impossibility results in computer science. After
all, the most famous open problem of computer science, the Ptime vs NP
problem, is so fascinating because the expected answer would tell us that a
large number of important problems cannot be solved eﬃciently.
8 1 Introduction
Concentrating on inexpressibility results highlights another important feature
of ﬁnite model theory: since we are often interested in counterexamples,
many constructions and techniques of interest apply only to a “small” fraction
of structures. In fact, we shall see that some techniques (e.g., locality)
degenerate to trivial statements on almost all structures, and yet it is that
small fraction of structures on which they behave interestingly that gives us
important techniques for analyzing expressiveness of logics, query languages,
etc. Towards the end of the book, we shall also see that on most typical structures,
some very expressive logics collapse to rather weak ones; however, all
interesting separation examples occur outside the class of “typical” structures.
1.4 An Overview of the Book
In Chap. 2, we review the background material from mathematical logic, computability
theory, and complexity theory.
In Chap. 3 we introduce the fundamental tool of Ehrenfeucht-Fra¨ıss´e
games, and prove their completeness for expressibility in ﬁrst-order logic (FO).
The game is played by two players, the spoiler and the duplicator, on two
structures. The spoiler tries to show that the structures are diﬀerent, while
the duplicator tries to show that they are the same. If the duplicator can
succeed for k rounds of such a game, it means that the structures cannot
be distinguished by FO sentences whose depth of quantiﬁer nesting does not
exceed k. We also deﬁne types, which play a very important role in many
aspects of ﬁnite model theory. In the same chapter, we see some bounds on
the expressive power of FO, proved via Ehrenfeucht-Fra¨ıss´e games.
Finding winning strategies in Ehrenfeucht-Fra¨ıss´e games becomes quite
hard for nontrivial structures. Thus, in Chap. 4, we introduce some suﬃcient
conditions that guarantee a win for the duplicator. These conditions are based
on the idea of locality. Intuitively, local formulae cannot see very far from their
free variables. We show several diﬀerent ways of formalizing this intuition,
and explain how each of those ways gives us easy proofs of bounds on the
expressiveness of FO.
In Chap. 5 we continue to study ﬁrst-order logic, but this time over structures
whose universe is ordered. Here we see the phenomenon that is very
common for logics over ﬁnite structures. We call a property of structures
order-invariant if it can be deﬁned with a linear order, but is independent
of a particular linear order used. It turns out that there are order-invariant
FO-deﬁnable properties that are not deﬁnable in FO alone. We also show that
such order-invariant properties continue to be local.
Chap. 6 deals with the complexity of FO. We distinguish two kinds of
complexity: data complexity, meaning that a formula is ﬁxed and the structure
on which it is evaluated varies, and combined complexity, meaning that both
the formula and the structure are part of the input. We show how to evaluate
1.4 An Overview of the Book 9
FO formulae by Boolean circuits, and use this to derive drastically diﬀerent
bounds for the complexity of FO: AC0
for data complexity, and Pspace for
combined complexity. We also consider the parametric complexity of FO: in
this case, the formula is viewed as a parameter of the input. Finally, we study
a subclass of FO queries, called conjunctive queries, which is very important
in database theory, and prove complexity bounds for it.
In Chap. 7, we move away from FO, and consider its extension with
monadic second-order quantiﬁers: such quantiﬁers can range over subsets of
the universe. The resulting logic is called monadic second-order logic, or MSO.
We also consider two restrictions of MSO: an ∃MSO formula starts with a sequence
of existential second-order quantiﬁers, which is followed by an FO
formula, and an ∀MSO formula starts with a sequence of universal secondorder
quantiﬁers, followed by an FO formula. We ﬁrst study ∃MSO and ∀MSO
on graphs, where they are shown to be diﬀerent. We then move to strings,
where MSO collapses to ∃MSO and captures precisely the regular languages.
Further restricting our attention to FO over strings, we prove that it captures
the star-free languages. We also cover MSO over trees, and tree automata.
In Chap. 8 we study a diﬀerent extension of FO: this time, we add mechanisms
for counting, such as counting terms, counting quantiﬁers, or certain
generalized unary quantiﬁers. We also introduce a logic that has a lot of
counting power, and prove that it remains local, much as FO. We apply these
results in the database setting, considering a standard feature of many query
languages – aggregate functions – and proving bounds on the expressiveness
of languages with aggregation.
In Chap. 9 we present the technique of coding Turing machines as ﬁnite
structures, and use it to prove two results: Trakhtenbrot’s theorem, which
says that the set of ﬁnitely satisﬁable sentences is not recursive, and Fagin’s
theorem, which says that NP problems are precisely those expressible in existential
second-order logic.
Chapter 10 deals with extensions of FO for expressing properties that,
algorithmically, require recursion. Such extensions have ﬁxed point operators.
There are three ﬂavors of them: least, inﬂationary, and partial ﬁxed point
operators. We study properties of resulting ﬁxed point logics, and prove that
in the presence of a linear order, they capture complexity classes Ptime (for
least and inﬂationary ﬁxed points) and Pspace (for partial ﬁxed points). We
also deal with a well-known database query language that adds ﬁxed points to
FO: Datalog. In the same chapter, we consider a closely related logic based
on adding the transitive closure operator to FO, and prove that over order
structures it captures nondeterministic logarithmic space.
Fixed point logics are not very easy to analyze. Nevertheless, they can be
embedded into a logic which uses inﬁnitary connectives, but has a restriction
that every formula only mentions ﬁnitely many variables. This logic, and
its fragments, are studied in Chap. 11. We introduce the logic Lω
∞ω, deﬁne
games for it, and prove that ﬁxed point logics are embeddable into it. We
10 1 Introduction
study deﬁnability of types for ﬁnite variable logics, and use them to provide
a purely logical counterpart of the Ptime vs. Pspace question.
In Chap. 12 we study the asymptotic behavior of FO and prove that every
FO sentence is either true in almost all structures, or false in almost all structures.
This phenomenon is known as the zero-one law. We also prove that
Lω
∞ω, and hence ﬁxed point logics, have the zero-one law. In the same chapter
we deﬁne an inﬁnite structure whose theory consists precisely of FO sentences
that hold in almost all structures. We also prove that almost everywhere, ﬁxed
point logics collapse to FO.
In Chap. 13, we show how ﬁnite and inﬁnite model theory mix: we look
at ﬁnite structures that live in an inﬁnite one, and study the power of FO
over such hybrid structures. We prove that for some underlying inﬁnite structures,
like N, +, · , every computable property of ﬁnite structures embedded
into them can be deﬁned, but for others, like R, +, · , one can only deﬁne
properties which are already expressible in FO over the ﬁnite structure alone.
We also explain connections between such mixed logics and database query
languages.
Finally, in Chap. 14, we outline other applications of ﬁnite model theory:
in decision problems in mathematical logic, in formal veriﬁcation of properties
of ﬁnite state systems, and in constraint satisfaction.
1.5 Exercises
Exercise 1.1. Show how to express the following properties of graphs in ﬁrst-order
logic:
• A graph is complete.
• A graph has an isolated vertex.
• A graph has at least two vertices of out-degree 3.
• Every vertex is connected by an edge to a vertex of out-degree 3.
Exercise 1.2. Show how to express the following properties of graphs in existential
second-order logic:
• A graph has a kernel, i.e., a set of vertices X such that there is no edge between
any two vertices in X, and every vertex outside of X is connected by an edge
to a vertex of X.
• A graph on n vertices has an independent set X (i.e., no two nodes in X are
connected by an edge) of size at least n/2.
• A graph has an even number of vertices.
• A graph has an even number of edges.
• A graph with m edges has a bipartite subgraph with at least m/2 edges.
Exercise 1.3. (a) Show how to deﬁne the following regular languages in monadic
second-order logic:
• a∗
(b + c)∗
aa∗
;
1.5 Exercises 11
• (aaa)∗
(bb)+
;
•
“`
(a + b)∗
cc∗
´∗
(aa)∗
”∗
a.
For the ﬁrst language, provide a ﬁrst-order deﬁnition as well.
(b) Let Φ be a monadic second-order logic sentence over strings. Show how to
construct a sentence Ψ such that Ms |= Ψ iﬀ there is a string s′
such that |s|=|s′
|
and Ms·s′ |= Φ. Here |s| refers to the length of s, and s · s′
is the concatenation of
s and s′
.
Remark: once we prove B¨uchi’s theorem in Chap. 7, you will see that the above
statement says that if L is a regular language, then the language
1
2
L = {s | for some s′
, |s|=|s′
| and s · s′
∈ L}
is regular too (see, e.g., Exercise 3.16 in Hopcroft and Ullman [126]).
2
Preliminaries
The goal of this chapter is to provide the necessary background from mathematical
logic, formal languages, and complexity theory.
2.1 Background from Mathematical Logic
We now brieﬂy review some standard deﬁnitions from mathematical logic.
Deﬁnition 2.1. A vocabulary σ is a collection of constant symbols (denoted
c1, . . . , cn, . . . ), relation, or predicate, symbols (P1, . . . , Pn, . . . ) and function
symbols (f1, . . . , fn, . . . ). Each relation and function symbol has an associated
arity.
A σ-structure (also called a model)
A = A, {cA
i }, {PA
i }, {fA
i }
consists of a universe A together with an interpretation of
• each constant symbol ci from σ as an element cA
i ∈ A;
• each k-ary relation symbol Pi from σ as a k-ary relation on A; that is, a
set PA
i ⊆ Ak
; and
• each k-ary function symbol fi from σ as a function fA
i : Ak
→ A.
A structure A is called ﬁnite if its universe A is a ﬁnite set. The universe of
a structure is typically denoted by a Roman letter corresponding to the name
of the structure; that is, the universe of A is A, the universe of B is B, and
so on. We shall also occasionally write x ∈ A instead of x ∈ A.
For example, if σ has constant symbols 0, 1, a binary relation symbol <,
and two binary function symbols · and +, then one possible structure for σ
is the real ﬁeld R = R, 0R
, 1R
, <R
, +R
, ·R
, where 0R
, 1R
, <R
, +R
, ·R
have
14 2 Preliminaries
the expected meaning. Quite often – in fact, typically – we shall omit the
superscript with the name of the structure, using the same symbol for both a
symbol in the vocabulary, and its interpretation in a structure. For example,
we shall write R = R, 0, 1, <, +, · for the real ﬁeld.
A few notes on restrictions on vocabularies are in order. Constants can be
treated as functions of arity zero; however, we often need them separately, as
in the ﬁnite case, we typically restrict vocabularies to relational ones: such vocabularies
contain only relation symbols and constants. This is not a serious
restriction, as ﬁrst-order logic deﬁnes, for each k-ary function f, its graph,
which is a (k + 1)-ary relation {(x, f(x)) | x ∈ Ak
}. A vocabulary that consists
exclusively of relation symbols (i.e., does not have constant and function
symbols) is called purely relational.
Unless stated explicitly otherwise, we shall assume that:
• any vocabulary σ is at most countable;
• when we deal with ﬁnite structures, vocabularies σ are ﬁnite and rela-
tional.
If σ is a relational vocabulary, then STRUCT[σ] denotes the class of all ﬁnite
σ-structures.
Next, we deﬁne ﬁrst-order (FO) formulae, free and bound variables, and
the semantics of FO formulae.
Deﬁnition 2.2. We assume a countably inﬁnite set of variables. Variables
will be typically denoted by x, y, z, . . ., with subscripts and superscripts. We
inductively deﬁne terms and formulae of the ﬁrst-order predicate calculus
over vocabulary σ as follows:
• Each variable x is a term.
• Each constant symbol c is a term.
• If t1, . . . , tk are terms and f is a k-ary function symbol, then f(t1, . . . , tk)
is a term.
• If t1, t2 are terms, then t1 = t2 is an (atomic) formula.
• If t1, . . . , tk are terms and P is a k-ary relation symbol, then P(t1, . . . , tk)
is an (atomic) formula.
• If ϕ1, ϕ2 are formulae, then ϕ1 ∧ ϕ2, ϕ1 ∨ ϕ2, and ¬ϕ1 are formulae.
• If ϕ is a formula, then ∃xϕ and ∀xϕ are formulae.
A formula that does not use existential (∃) and universal (∀) quantiﬁers
is called quantiﬁer-free.
We shall use the standard shorthand ϕ → ψ for ¬ϕ ∨ ψ and ϕ ↔ ψ for
(ϕ → ψ) ∧ (ψ → ϕ).
Free variables of a formula or a term are deﬁned as follows:
2.1 Background from Mathematical Logic 15
• The only free variable of a term x is x; a constant term c does not have
free variables.
• Free variables of t1 = t2 are the free variables of t1 and t2; free variables
of P(t1, . . . , tk) or f(t1, . . . , tk) are the free variables of t1, . . . , tk.
• Negation (¬) does not change the list of free variables; the free variables
of ϕ1 ∨ ϕ2 (and of ϕ1 ∧ ϕ2) are the free variables of ϕ1 and ϕ2.
• Free variables of ∀xϕ and ∃xϕ are the free variables of ϕ except x.
Variables that are not free are called bound.
If x is the tuple of all the free variables of ϕ, we write ϕ(x). A sentence
is a formula without free variables. We often use capital Greek letters for
sentences.
Given a set of formulae S, formulae constructed from formulae in S using
only the Boolean connectives ∨, ∧, and ¬ are called Boolean combinations of
formulae in S.
Given a σ-structure A, we deﬁne inductively for each term t with free
variables (x1, . . . , xn) the value tA
(a), where a ∈ An
, and for each formula
ϕ(x1, . . . , xn), the notion of A |= ϕ(a) (i.e., ϕ(a) is true in A).
• If t is a constant symbol c, then the value of t in A is cA
.
• If t is a variable xi, then the value of tA
(a) is ai.
• If t = f(t1, . . . , tk), then the value of tA
(a) is fA
(tA
1 (a), . . . , tA
k (a)).
• If ϕ ≡ (t1 = t2), then A |= ϕ(a) iﬀ tA
1 (a) = tA
2 (a).
• If ϕ ≡ P(t1, . . . , tk), then A |= ϕ(a) iﬀ (tA
1 (a), . . . , tA
k (a)) ∈ PA
.
• A |= ¬ϕ(a) iﬀ A |= ϕ(a) does not hold.
• A |= ϕ1(a) ∧ ϕ2(a) iﬀ A |= ϕ1(a) and A |= ϕ2(a).
• A |= ϕ1(a) ∨ ϕ2(a) iﬀ A |= ϕ1(a) or A |= ϕ2(a).
• If ψ(x) ≡ ∃yϕ(y, x), then A |= ψ(a) iﬀ A |= ϕ(a′
, a) for some a′
∈ A.
• If ψ(x) ≡ ∀yϕ(y, x), then A |= ψ(a) iﬀ A |= ϕ(a′
, a) for all a′
∈ A.
If A ∈ STRUCT[σ] and A0 ⊆ A, the substructure of A generated by A0 is
a σ-structure B whose universe is B = A0 ∪ {cA
| c a constant symbol in σ},
with cB
= cA
for every c, and with each k-ary relation R interpreted as the
restriction of RA
to B: that is, RB
= RA
∩ Bk
.
Let σ′
be a vocabulary disjoint from σ. Let A be a σ-structure, and let A′
be a σ′
-structure with the same universe A. We then write (A, A′
) for a σ∪σ′
structure
on A in which all constant and relation symbols in σ are interpreted
as in A, and all constant and relation symbols in σ′
are interpreted as in A′
.
One of the most common instances of such an expansion is when σ′
only
contains constant symbols; in this case, the expansion allows us to go back and
16 2 Preliminaries
forth between formulae and sentences, which will be very convenient when we
talk about games and expressiveness of formulas as well as sentences.
From now on, we shall use the notation σn for the expansion of vocabulary
σ with n new constant symbols c1, . . . , cn.
Let ϕ(x1, . . . , xn) be a formula in vocabulary σ. Consider a σn sentence Φ
obtained from ϕ by replacing each xi with ci, i ≤ n. Let (a1, . . . , an) ∈ An
.
Then one can easily show (the proof is left as an exercise) the following:
Lemma 2.3. A |= ϕ(a1, . . . , an) iﬀ (A, a1, . . . , an) |= Φ.
This correspondence is rather convenient: we often do not need separate
treatment for sentences and formulae with free variables.
Most classical theorems from model theory fail in the ﬁnite case, as will be
seen later. However, two fundamental facts – compactness and the L¨owenheimSkolem
theorem – will be used to prove results about ﬁnite models. To state
them, we need the following deﬁnition.
Deﬁnition 2.4. A theory (over σ) is a set of sentences. A σ-structure A is a
model of a theory T iﬀ for every sentence Φ of T , the structure A is a model
of Φ; that is, A |= Φ. A theory T is called consistent if it has a model.
Theorem 2.5 (Compactness). A theory T is consistent iﬀ every ﬁnite subset
of T is consistent.
Theorem 2.6 (L¨owenheim-Skolem). If T has an inﬁnite model, then it
has a countable model.
In general, Theorem 2.1 allows one to construct a model of cardinality
max{ω, |σ|}, but we shall never deal with uncountable vocabularies here.
Compactness follows from the completeness theorem, stating that T |= ϕ
iﬀ T ⊢ ϕ, where ⊢ refers to a derivation in a formal proof system. We shall
see some other important corollaries of this result.
We say that a sentence Φ is satisﬁable if it has a model, and it is valid if it
is true in every structure. These notions are closely related: Φ is not valid iﬀ
¬Φ is satisﬁable. It follows from completeness that the set of valid sentences is
recursively enumerable (if you forgot the deﬁnition of recursively enumerable,
it is given in the next section). This is true when one considers validity with
respect to arbitrary models; we shall see later that validity over ﬁnite models
in not recursively enumerable.
Given two structures A and B of a relational vocabulary σ, a homomorphism
between them is a mapping h : A → B such that for each constant
symbol c in σ, we have h(cA
) = cB
, and for each k-ary relation symbol R and
a tuple (a1, . . . , ak) ∈ RA
, the tuple (h(a1), . . . , h(ak)) is in RB
. A bijective
2.2 Background from Automata and Computability Theory 17
homomorphism h whose inverse is also a homomorphism is called an isomorphism.
If there is an isomorphism between two structures A and B, we say
that they are isomorphic, and we write A ∼= B.
Next, we need the following basic deﬁnition of m-ary queries.
Deﬁnition 2.7 (Queries). An m-ary query, m ≥ 0, on σ-structures, is a
mapping Q that associates with each structure A a subset of Am
, such that
Q is closed under isomorphism: if A ∼= B via isomorphism h : A → B, then
Q(B) = h(Q(A)).
We say that Q is deﬁnable in a logic L if there is a formula ϕ(x1, . . . , xm)
of L in vocabulary σ such that for every A,
Q(A) = {(a1, . . . , am) ∈ Am
| A |= ϕ(a1, . . . , am)}.
If Q is deﬁnable by ϕ, we shall also write ϕ(A) instead of Q(A). Furthermore,
for a formula ϕ(x, y), we write ϕ(A, b) for {a ∈ A|a|
| A |= ϕ(a, b)}.
A very important special case is that of m = 0. We assume that A0
is a
one-element set, and there are only two subsets of A0
. Hence, a 0-ary query is
just a mapping from σ-structures to a two-element set, which can be assumed
to contain true and false. Such queries will be called Boolean. A Boolean query
can be associated with a subset C ⊆ STRUCT[σ] closed under isomorphism:
A ∈ C iﬀ Q(A) = true.
Such a query Q is deﬁnable in a logic L if there is an L-sentence Φ such that
Q(A) = true iﬀ A |= Φ.
An example of a binary (m = 2) query is the transitive closure of a graph.
An example of a unary (m = 1) query is the set of all isolated nodes in a
graph. An example of a Boolean (m = 0) query on graphs is planarity.
2.2 Background from Automata and Computability
Theory
In this section we brieﬂy review some basic concepts of ﬁnite automata and
computability theory.
Let Σ be a ﬁnite nonempty alphabet; that is, a ﬁnite set of symbols. The
set of all ﬁnite strings over Σ will be denoted by Σ∗
. We shall use s · s′
to
denote concatenation of two strings s and s′
. The empty string is denoted by
ǫ. One commonly refers to subsets of Σ∗
as languages.
A nondeterministic ﬁnite automaton is a tuple A = (Q, Σ, q0, F, δ) where
Q is a ﬁnite set of states, Σ is a ﬁnite alphabet, q0 ∈ Q is the initial state,
F ⊆ Q is the set of ﬁnal states, and δ : Q×Σ → 2Q
is the transition function.
An automaton is deterministic if |δ(q, a)| = 1 for every q and a; that is, if δ
can be viewed as a function Q × Σ → Q.
18 2 Preliminaries
Let s = a1a2 . . . an be a string in Σ∗
. Deﬁne a run of A on s as a mapping
r : {1, . . . , n} → Q such that
• r(1) ∈ δ(q0, a1) (or r(1) = δ(q0, a1) if A is deterministic), and
• r(i + 1) ∈ δ(r(i), ai+1) (or r(i + 1) = δ(r(i), ai+1) if A is deterministic).
We say that a run is accepting if r(n) ∈ F, and that A accepts s if there is
an accepting run (for the case of a deterministic automaton, there is exactly
one run for each string). The set of all strings accepted by A is denoted by
L(A).
A language L is called regular if there is a nondeterministic ﬁnite automaton
A such that L = L(A). It is well known that for any regular language L,
one can ﬁnd a deterministic ﬁnite automaton A such that L = L(A).
Turing machines are the most general computing devices. Formally, a Turing
machine M is a tuple (Q, Σ, ∆, δ, q0, Qa, Qr), where
• Q is a ﬁnite set of states;
• Σ is a ﬁnite input alphabet;
• ∆ is a ﬁnite tape alphabet; it contains Σ and a designated blank symbol
’ ’;
• δ : Q × ∆ → 2Q×∆×{ℓ,r}
is the transition function;
• q0 ∈ Q is the initial state;
• Qa and Qr are the sets of accepting and rejecting states respectively; we
require that Qa ∩ Qr = ∅. We refer to states in Qa ∪ Qr as the halting
states.
A Turing machine is called deterministic if |δ(q, a)|= 1 for every q, a; that is,
if δ can be viewed as a function Q × ∆ → Q × ∆ × {ℓ, r}.
We assume that Turing machines have a one-way inﬁnite tape, and one
head. A conﬁguration of a Turing machine M speciﬁes the contents of the
tape, the state, and the position of the head as follows. Let the tape contain
symbols w1, w2, . . ., where wi ∈ ∆ is the symbol in the ith position of the
tape. Assume that the head is in position j, and n ≥ j is such that for
all n′
> n, wn′ = (the blank symbol). If M is in state q, we denote this
conﬁguration by w1w2 . . . wj−1qwj . . . wn. We deﬁne the relation C ⊢δ C′
as
follows. If C = s · q · a · s′
, where s, s′
∈ ∆∗
, a ∈ ∆, and q ∈ Qa ∪ Qr, then
• if (q′
, b, ℓ) ∈ δ(q, a), then C ⊢δ s0 · q′
· c · b · s′
, where s = s0 · c (that is, a is
replaced by b, the new state is q′
, and the head moves left; if s = ǫ, then
C ⊢δ q′
· b · s′
), and
• if (q′
, b, r) ∈ δ(q, a), then C ⊢δ s · b · q′
· s′
(that is, a is replaced by b, the
new state is q′
, and the head moves right).
2.3 Background from Complexity Theory 19
A conﬁguration s · q · s′
is accepting if q ∈ Qa, and rejecting if q ∈ Qr.
Suppose we have a string s ∈ Σ∗
. The initial conﬁguration C(s) corresponding
to this string is q0 · s; that is, the state is q0, the head points to the
ﬁrst position of s, and the tape contains s followed by blanks. We say that s
is accepted by M if there is a sequence of conﬁgurations C0, C1, . . . , Cn such
that C0 = C(s), Ci ⊢δ Ci+1, i < n, and Cn is an accepting conﬁguration. The
set of all strings accepted by M is denoted by L(M).
We call a subset L of Σ∗
recursively enumerable, or r.e. for short, if there
is a Turing machine M such that L = L(M).
Notice that in general, there are three possibilities for computations by a
Turing machine M on input s: M accepts s, or M eventually enters a rejecting
state, or M loops; that is, it never enters a halting state. We call a Turing
machine halting if the last outcome is impossible. In other words, on every
input, M eventually enters a halting state.
We call a subset L of Σ∗
recursive if there is a halting Turing machine
M such that L = L(M). Halting Turing machines can be seen as deciders for
some sets L: for every string s, M eventually enters either an accepting or a
rejecting state, which decides whether s ∈ L. For that reason, one sometimes
uses decidable instead of recursive. When we speak of decidable problems, we
mean that a suitable encoding of the problem as a subset of Σ∗
for some ﬁnite
Σ is decidable.
A canonical example of an undecidable problem is the halting problem:
given a Turing machine M and an input w, does M halt on w (i.e., eventually
enters a halting state)? In general, any nontrivial property of recursively
enumerable sets is undecidable. One result we shall use later is that it is
undecidable whether a given Turing machine halts on the empty input.
2.3 Background from Complexity Theory
Let L be a language accepted by a halting Turing machine M. Assume that
for some function f : N → N, it is the case that the number of transitions M
makes before accepting or rejecting a string s is at most f(|s|), where |s| is
the length of s. If M is deterministic, then we write L ∈ DTIME(f); if M is
nondeterministic, then we write L ∈ NTIME(f).
We deﬁne the class Ptime of polynomial-time computable problems as
Ptime =
k∈N
DTIME(nk
),
and the class NP of problems computable by nondeterministic polynomialtime
Turing machines as
NP =
k∈N
NTIME(nk
).
20 2 Preliminaries
The class coNP is deﬁned as the class of languages whose complements are
in NP. Notice that Ptime is closed under complementation, but this is not
clear in the case of NP. We have Ptime ⊆ NP ∩ coNP, but it is not known
whether the containment is proper, and whether NP equals coNP.
Now assume that f(n) ≥ n for all n ∈ N. Deﬁne DSPACE(f) as the class
of languages L that are accepted by deterministic halting Turing machines
M such that for every string s, the length of the longest conﬁguration of M
that occurs during the computation on s is at most f(|s|). In other words, M
does not use more than f(|s|) cells of the tape. Similarly, we deﬁne the class
NSPACE(f) by using nondeterministic machines. We then let
Pspace =
k∈N
DSPACE(nk
).
In the case of space complexity, the nondeterministic case collapses to the
deterministic one: by Savitch’s theorem, Pspace = k∈N NSPACE(nk
).
To deﬁne space complexity for sublinear functions f, we use a model of
Turing machines with a work tape. In such a model, a machine M has two
tapes, and two heads. The ﬁrst tape is the input tape: it stores the input, and
the machine cannot write on it (but can move the head). The second tape is the
work tape, which operates as the normal tape of a Turing machine. We deﬁne
the class NLog as the class of languages accepted by such nondeterministic
machines where the size of the work tape does not exceed O(log |s|), on the
input s. Likewise, we deﬁne the class DLog as the class of language accepted
by deterministic machines with the work tape, where at most O(log |s|) cells
of the work tape are used.
Finally, we deﬁne the polynomial hierarchy PH. Let Σp
0 = Πp
0 = Ptime.
Deﬁne inductively Σp
i = NPΣp
i−1 , for i ≥ 1. That is, languages in Σp
i are those
accepted by a nondeterministic Turing machine running in polynomial time
such that this machine can make “calls” to another machine that computes
a language in Σp
i−1. Such a call is assumed to have unit cost. We deﬁne the
class Πp
i as the class of languages whose complements are in Σp
i .
Notice that Σp
1 = NP and Πp
1 = coNP. We deﬁne the polynomial hierarchy
as
PH =
i∈N
Σp
i =
i∈N
Πp
i .
This will be suﬃcient for our purposes, but there is another interesting deﬁnition
of PH in terms of alternating Turing machines.
The relationship between the complexity classes we introduced is as fol-
lows:
DLog ⊆ NLog ⊆ Ptime ⊆
NP
coNP
⊆ PH ⊆ Pspace.
2.4 Bibliographic Notes 21
None of the containments of any two consecutive classes in this sequence is
known to be proper, although it is known that NLog Pspace.
We shall also refer to two classes based on exponential running time. These
are
Exptime =
k∈N
DTIME(2nk
) and Nexptime =
k∈N
NTIME(2nk
).
Both of these contain Pspace.
Later in the book we shall see a number of other complexity classes, in
particular circuit-based classes AC0
and TC0
(which are both contained in
DLog).
2.4 Bibliographic Notes
Standard mathematical logic texts are Enderton [66], Ebbinghaus, Flum, and
Thomas [61], and van Dalen [241]; inﬁnite model theory is the subject of
Chang and Keisler [35], Hodges [125], and Poizat [201]. Good references on
complexity theory are Papadimitriou [195], Johnson [139], and Du and Ko
[59]. For the basics on automata and computability, see Hopcroft and Ullman
[126], Khoussainov and Nerode [145], and Sipser [221].
3
Ehrenfeucht-Fra¨ıss´e Games
We start this chapter by giving a few examples of inexpressibility proofs,
using the standard model-theoretic machinery (compactness, the L¨owenheimSkolem
theorem). We then show that this machinery is not generally applicable
in the ﬁnite model theory context, and introduce the notion of
Ehrenfeucht-Fra¨ıss´e games for ﬁrst-order logic. We prove the EhrenfeuchtFra¨ıss´e
theorem, characterizing the expressive power of FO via games, and
introduce the notion of types, which will be central throughout the book.
3.1 First Inexpressibility Proofs
How can one prove that a certain property is inexpressible in FO? Certainly
logicians must have invented tools for proving such results, and we shall now
see a few examples. The problem is that these tools are not particularly well
suited to the ﬁnite context, so in the next section, we introduce a diﬀerent
technique that will be used for FO and other logics over ﬁnite models.
In the ﬁrst example, we deal with connectivity: given a graph G, is it
connected? Recall that a graph with an edge relation E is connected if for
every two nodes a, b one can ﬁnd a number n and nodes c1, . . . , cn ∈ V such
that (a, c1), (c1, c2), . . . , (cn, b) are all edges in the graph. A standard modeltheoretic
argument below shows that connectivity is not FO-deﬁnable.
Proposition 3.1. Connectivity of arbitrary graphs is not FO-deﬁnable.
Proof. Assume that connectivity is deﬁnable by a sentence Φ, over vocabulary
σ = {E}. Let σ2 expand σ with two constant symbols, c1 and c2. For every
n, let Ψn be the sentence
¬ ∃x1 . . . ∃xn (E(c1, x1) ∧ E(x1, x2) ∧ . . . ∧ E(xn, c2)) ,
saying that there is no path of length n + 1 from c1 to c2.
Let T be the theory
24 3 Ehrenfeucht-Fra¨ıss´e Games
{Ψn | n > 0} ∪ {¬(c1 = c2), ¬E(c1, c2)} ∪ {Φ}.
We claim that T is consistent. By compactness, we have to show that every
ﬁnite subset T ′
⊆ T is consistent. Indeed, let N be such that for all Ψn ∈ T ′
,
n < N. Then a connected graph in which the shortest path from c1 to c2 has
length N + 1 is a model of T ′
.
Since T is consistent, it has a model. Let G be a model of T . Then G is
connected, but there is no path from c1 to c2 of length n, for any n. This
contradiction shows that connectivity is not FO-deﬁnable.
Does the proof above tell us that FO, or relational calculus, cannot express
the connectivity test over ﬁnite graphs? Unfortunately, it does not. While
connectivity is not deﬁnable in FO over arbitrary graphs, the proof above
leaves open the possibility that there is a ﬁrst-order sentence that correctly
tests connectivity only for ﬁnite graphs. But to prove the desired result for
relational calculus, one has to show inexpressibility of connectivity over ﬁnite
graphs.
Can one modify the proof above for ﬁnite models? An obvious way to do
so would be to use compactness over ﬁnite graphs (i.e., if every ﬁnite subset
of T has a ﬁnite model, then T has a ﬁnite model), assuming this holds.
Unfortunately, this turns out not to be the case.
Proposition 3.2. Compactness fails over ﬁnite models: there is a theory T
such that
1. T has no ﬁnite models, and
2. every ﬁnite subset of T has a ﬁnite model.
Proof. We assume that σ = ∅, and deﬁne λn as a sentence stating that the
universe has at least n distinct elements:
λn ≡ ∃x1 . . . ∃xn
i=j
¬(xi = xj). (3.1)
Now T = {λn|n ≥ 0}. Clearly, T has no ﬁnite model, but for each ﬁnite subset
{λn1 , . . . , λnk
} of T , a set whose cardinality exceeds all the ni’s is a model.
However, sometimes a compactness argument works nicely in the ﬁnite
context. We now consider a very important property, which will be seen many
times in this book. We want to test if the cardinality of the universe is even.
That is, we are interested in query even deﬁned as
even(A) = true iﬀ |A| mod 2 = 0.
Note that this only makes sense over ﬁnite models; for inﬁnite A the value of
even could be arbitrary.
3.1 First Inexpressibility Proofs 25
Proposition 3.3. Assume that σ = ∅. Then even is not FO-deﬁnable.
Proof. Suppose even is deﬁnable by a sentence Φ. Consider sentences λn (3.1)
from the proof of Proposition 3.2 and two theories:
T1 = {Φ} ∪ {λk | k > 0}, T2 = {¬Φ} ∪ {λk | k > 0}.
By compactness, both are consistent. These theories only have inﬁnite
models, so by the L¨owenheim-Skolem theorem, both have countable models,
A1 and A2. Since σ = ∅, the structures A1 and A2 are just countable sets, and
hence isomorphic. Thus, we have two isomorphic models, A1 and A2, with
A1 |= Φ and A2 |= ¬Φ. This contradiction proves the result.
This is nice, but there is a small problem: we assumed that the vocabulary
is empty. But what if we have, for example, σ = {<}, and we want to prove
that evenness of ordered sets is not deﬁnable? In this case we would expand T1
and T2 with axioms of ordered sets, and we would obtain, by compactness and
L¨owenheim-Skolem, two countable linear orderings A1 and A2, one a model of
Φ, the other a model of ¬Φ. This is a dead end, since two arbitrary countable
linear orders need not be isomorphic (in fact, some can be distinguished by
ﬁrst-order sentences: think, for example, of a discrete order like N, < and a
dense one like Q, < ).
Thus, while traditional tools from model theory may help us prove some
results, they are often not suﬃcient for proving results about ﬁnite models. We
shall examine, in subsequent chapters, tools designed for proving expressivity
bounds in the ﬁnite case.
As an introduction to these tools, let us revisit the proof of Proposition
3.3. In the proof, we constructed two models, A1 and A2, that agree on all
FO sentences (since they are isomorphic), and yet compactness tells us that
they disagree on Φ, which was assumed to deﬁne even – hence even is not
ﬁrst-order.
Can we extend this technique to prove inexpressibility results over ﬁnite
models? The most straightforward attempt to do so fails due to the following.
Lemma 3.4. For every ﬁnite structure A, there is a sentence ΦA such that
B |= ΦA iﬀ B ∼= A.
Proof. Assume without loss of generality that A is a graph: σ = {E}. Let
A = {a1, . . . , an}, E . Deﬁne ΦA as
∃x1 . . . ∃xn



i=j
¬(xi = xj) ∧ ∀y
i
y = xi
∧
(ai,aj)∈E
E(xi, xj) ∧
(ai,aj )∈E
¬E(xi, xj)


 .
Then B |= ΦA iﬀ B ∼= A.
26 3 Ehrenfeucht-Fra¨ıss´e Games
In particular, every two ﬁnite structures that agree on all FO sentences
are isomorphic, and hence agree on any Boolean query (as Boolean queries
are closed under isomorphism).
The idea that is prevalent in inexpressibility proofs in ﬁnite model theory
is, nevertheless, very close to the original idea of ﬁnding structures A
and B that agree on all FO sentences but disagree on a given query. But
instead of two structures, A and B, we consider two families of structures,
{Ak | k ∈ N} and {Bk | k ∈ N}, and instead of all FO sentences, we consider
a certain partition of FO sentences into inﬁnitely many classes.
In general, the methodology is as follows. Suppose we want to prove that
a property P is not expressible in a logic L. We then partition the set of all
sentences of L into countably many classes, L[0], L[1], . . . , L[k], . . . (we shall
see in Sect. 3.3 how to do it), and ﬁnd two families of structures, {Ak | k ∈ N}
and {Bk | k ∈ N}, such that
• Ak |= Φ iﬀ Bk |= Φ for every L[k] sentence Φ; and
• Ak has property P, but Bk does not.
Clearly, this would show P ∈ L; it “only” remains to show what L[k] is,
and give techniques that help us prove that two structures agree on L[k]. We
shall do precisely that in the rest of the chapter, for the case of L = FO, and
later for other logics.
3.2 Deﬁnition and Examples of Ehrenfeucht-Fra¨ıss´e
Games
Ehrenfeucht-Fra¨ıss´e games give us a nice tool for describing expressiveness of
logics over ﬁnite models. In general, games are applicable for both ﬁnite and
inﬁnite models (at least for FO), but we have seen that in the inﬁnite case we
have a number of more powerful tools. In fact, in some model theory texts,
Ehrenfeucht-Fra¨ıss´e games are only brieﬂy mentioned (or even appear only
as exercises), but in the ﬁnite case, their applicability makes them a central
notion.
The idea of the game – for FO and other logics as well – is almost invariably
the same. There are two players, called the spoiler and the duplicator (or, less
imaginatively, player I and player II). The board of the game consists of two
structures, say A and B. The goal of the spoiler is to show that these two
structures are diﬀerent; the goal of the duplicator is to show that they are the
same.
In the classical Ehrenfeucht-Fra¨ıss´e game, the players play a certain number
of rounds. Each round consists of the following steps:
1. The spoiler picks a structure (A or B).
3.2 Deﬁnition and Examples of Ehrenfeucht-Fra¨ıss´e Games 27
2. The spoiler makes a move by picking an element of that structure: either
a ∈ A or b ∈ B.
3. The duplicator responds by picking an element in the other structure.
An illustration is given in Fig. 3.1. The spoiler’s moves are shown as ﬁlled
circles, and the duplicator’s moves as empty circles. In the ﬁrst round, the
spoiler picks B and selects b1 ∈ B; the duplicator responds by a1 ∈ A. In the
next round, the spoiler changes structures and picks a2 ∈ A; the duplicator
responds by b2 ∈ B. In the third round the spoiler plays b3 ∈ B; the response
of the duplicator is a3 ∈ A.
Since there is a game, someone must win it. To deﬁne the winning condition
we need a crucial deﬁnition of a partial isomorphism. Recall that all ﬁnite
structures have a relational vocabulary (no function symbols).
Deﬁnition 3.5 (Partial isomorphism). Let A, B be two σ-structures,
where σ is relational, and a = (a1, . . . , an) and b = (b1, . . . , bn) two tuples
in A and B respectively. Then (a, b) deﬁnes a partial isomorphism between A
and B if the following conditions hold:
• For every i, j ≤ n,
ai = aj iﬀ bi = bj.
• For every constant symbol c from σ, and every i ≤ n,
ai = cA
iﬀ bi = cB
.
• For every k-ary relation symbol P from σ and every sequence (i1, . . . , ik)
of (not necessarily distinct) numbers from [1, n],
(ai1 , . . . , aik
) ∈ PA
iﬀ (bi1 , . . . , bik
) ∈ PB
.
In the absence of constant symbols, this deﬁnition says that the mapping
ai → bi, i ≤ n, is an isomorphism between the substructures of A and B
generated by {a1, . . . , an} and {b1, . . . , bn}, respectively.
After n rounds of an Ehrenfeucht-Fra¨ıss´e game, we have moves (a1, . . . , an)
and (b1, . . . , bn). Let c1, . . . , cl be the constant symbols in σ; then c A
denotes
(cA
1 , . . . , cA
l ) and likewise for c B
. We say that (a, b) is a winning position for
the duplicator if
((a, c A
), (b, c B
))
is a partial isomorphism between A and B. In other words, the map that
sends each ai into bi and each cA
j into cB
j is an isomorphism between
the substructures of A and B generated by {a1, . . . , an, cA
1 , . . . , cA
l } and
{b1, . . . , bn, cB
1 , . . . , cB
l } respectively.
We say that the duplicator has an n-round winning strategy in the
Ehrenfeucht-Fra¨ıss´e game on A and B if the duplicator can play in a way
28 3 Ehrenfeucht-Fra¨ıss´e Games
 
-
 A
B
a1
a2
b1
b2
b3
c A
c B
a3
Fig. 3.1. Ehrenfeucht-Fra¨ıss´e game
that guarantees a winning position after n rounds, no matter how the spoiler
plays. Otherwise, the spoiler has an n-round winning strategy. If the duplicator
has an n-round winning strategy, we write A ≡n B.
Observe that A ≡n B implies A ≡k B for every k ≤ n.
Before we connect Ehrenfeucht-Fra¨ıss´e games and FO-deﬁnability, we give
some examples of winning strategies.
Games on Sets
In this example, the vocabulary σ is empty. That is, a structure is just a set.
Let |A|, |B| ≥ n. Then A ≡n B.
The strategy for the duplicator works as follows. Suppose i rounds have
been played, and the position is ((a1, . . . , ai), (b1, . . . , bi)). Assume the spoiler
picks an element ai+1 ∈ A. If ai+1 = aj for j ≤ i, then the duplicator
responds with bi+1 = bj; otherwise, the duplicator responds with any bj+1 ∈
B − {b1, . . . , bi} (which exists since |B |≥ n).
Games on Linear Orders
Our next example is a bit more complicated, as we add a binary relation <
to σ, to be interpreted as a linear order. Now suppose L1, L2 are two linear
orders of size at least n (i.e., structures of the form {1, . . . , m}, < , m ≥ n).
Is it true that L1 ≡n L2?
It is very easy to see that the answer is negative even for the case of n = 2.
Let L1 contain three elements (say {1, 2, 3}), and L2 two elements ({1, 2}). In
the ﬁrst move, the spoiler plays 2 in L1. The duplicator has to respond with
either 1 or 2 in L2. Suppose the duplicator responds with 1 ∈ L2; then the
spoiler plays 1 ∈ L1 and the duplicator is lost, since he has to respond with
an element less than 1 in L1, and there is no such element. If the duplicator
selects 2 ∈ L2 as his ﬁrst-round move, the spoiler plays 3 ∈ L1, and the
duplicator is lost again. Hence, L1 ≡2 L2.
However, a winning strategy for the duplicator can be guaranteed if L1, L2
are much larger than the number of rounds.
3.2 Deﬁnition and Examples of Ehrenfeucht-Fra¨ıss´e Games 29
aj ai+1 al
bj bi+1 bl
2k−i
2k−(i+1) 2k−(i+1)
2k−i
2k−(i+1) 2k−(i+1)
d
d
2k−i
2k−(i+1)
2k−i
2k−(i+1)
aj ai+1 al
bj bi+1 bl
(a) d < 2k−(i+1) (b)
L1
L2
Fig. 3.2. Illustration for the proof of Theorem 3.6
Theorem 3.6. Let k > 0, and let L1, L2 be linear orders of length at least 2k
.
Then L1 ≡k L2.
We shall give two diﬀerent proofs of this result that illustrate two diﬀerent
techniques often used in game proofs.
Theorem 3.6, Proof # 1. The idea of the ﬁrst proof is as follows. We use
induction on the number of rounds of the game, and our induction hypothesis
is stronger than just the partial isomorphism claim. The reason is that if we
simply state that after i rounds we have a partial isomorphism, the induction
step will not get oﬀ the ground as there are too few assumptions. Hence, we
have to make additional assumptions. But if we try to impose too many conditions,
there is no guarantee that a game can proceed in a way that preserves
them. The main challenge in proofs of this type is to ﬁnd the right induction
hypothesis: the one that is strong enough to imply partial isomorphism, and
that has enough conditions to make the inductive proof possible.
We now illustrate this general principle by proving Theorem 3.6. We expand
the vocabulary with two new constant symbols min and max, to be
interpreted as the minimum and the maximum element of a linear ordering,
and we prove a stronger fact that L1 ≡k L2 in the expanded vocabulary.
Let L1 have the universe {1, . . ., n} and L2 have the universe {1, . . ., m}.
Assume that the lengths of L1 and L2 are at least 2k
; that is, n, m ≥ 2k
+1. The
distance between two elements x, y of the universe, d(x, y), is simply |x − y|.
We claim that the duplicator can play in such a way that the following holds
after each round i. Let a = (a−1, a0, a1, . . . , ai) consist of a−1 = minL1
, a0 =
maxL1
and the i moves a1, . . . , ai in L1, and likewise let b = (b−1, b0, b1, . . . , bi)
consist of b−1 = minL2
, b0 = maxL2
and the i moves in L2. Then, for −1 ≤
j, l ≤ i:
30 3 Ehrenfeucht-Fra¨ıss´e Games
1. if d(aj, al) < 2k−i
, then d(bj, bl) = d(aj, al).
2. if d(aj, al) ≥ 2k−i
, then d(bj, bl) ≥ 2k−i
.
3. aj ≤ al ⇐⇒ bj ≤ bl.
(3.2)
We prove (3.2) by induction; notice that the third condition ensures partial
isomorphism, so we do prove an induction statement that says more than just
maintaining partial isomorphism.
And now a simple proof: the base case of i = 0 is immediate since
d(a−1, a0), d(b−1, b0) ≥ 2k
by assumption. For the induction step, suppose
the spoiler is making his (i + 1)st move in L1 (the case of L2 is symmetric).
If the spoiler plays one of aj, j ≤ i, the response is bj, and all the conditions
are trivially preserved. Otherwise, the spoiler’s move falls into an interval, say
aj < ai+1 < al, such that no other previously played moves are in the same
interval. By condition 3 of (3.2), this means that the interval between bj and
bl contains no other elements of b. There are two cases:
• d(aj, al) < 2k−i
. Then d(bj, bl) = d(aj, al), and the intervals [aj, al] and
[bj, bl] are isomorphic. Then we simply ﬁnd bi+1 so that d(aj, ai+1) =
d(bj, bi+1) and d(ai+1, al) = d(bi+1, bl). Clearly, this ensures that all the
conditions in (3.2) hold.
• d(aj, al) ≥ 2k−i
. In this case d(bj, bl) ≥ 2k−i
. We have three possibilities:
1. d(aj, ai+1) < 2k−(i+1)
. Then d(ai+1, al) ≥ 2k−(i+1)
, and we can choose
bi+1 so that d(bj, bi+1) = d(aj, ai+1) and d(bi+1, bl) ≥ 2k−(i+1)
. This
is illustrated in Fig. 3.2 (a), where d stands for d(aj, ai+1).
2. d(ai+1, al) < 2k−(i+1)
. This case is similar to the previous one.
3. d(aj, ai+1) ≥ 2k−(i+1)
, d(ai+1, al) ≥ 2k−(i+1)
. Since d(bj, bl) ≥ 2k−i
,
by choosing bi+1 to be the middle of the interval [bj, bl] we ensure
that d(bj, bi+1) ≥ 2k−(i+1)
and d(bi+1, bl) ≥ 2k−(i+1)
. This case is
illustrated in Fig. 3.2 (b).
Thus, in all the cases, (3.2) is preserved.
This completes the inductive proof; hence we have shown that the duplicator
can win a k-round Ehrenfeucht-Fra¨ıss´e game on L1 and L2.
Theorem 3.6, Proof # 2. The second proof relies on the composition
method: a way of composing simpler games into more complicated ones.
Before we proceed, we make the following observation. Suppose L1 ≡k L2.
Then we can assume, without loss of generality, that the duplicator has a
winning strategy in which he responds to the minimal element of one ordering
by the minimal element of the other ordering (and likewise for the maximal
elements).
Indeed, suppose the spoiler plays minL1
, the minimal element of L1. If the
duplicator responds by b > minL2
and there is at least one round left, then
in the next round the spoiler plays minL2
and the duplicator loses. If this is
the last round of the game, then the duplicator can respond by any element
3.2 Deﬁnition and Examples of Ehrenfeucht-Fra¨ıss´e Games 31
that does not exceed those previously played in L2, in particular, minL2
. The
proof for other cases is similar.
Let L be a linear ordering, and a ∈ L. By L≤a
we mean the substructure
of L that consists of all the elements b ≤ a, and by L≥a
the substructure of L
that consists of all the elements b ≥ a. The composition result we shall need
says the following.
Lemma 3.7. Let L1, L2, a ∈ L1, and b ∈ L2 be such that
L≤a
1 ≡k L≤b
2 and L≥a
1 ≡k L≥b
2 .
Then (L1, a) ≡k (L2, b).
Proof of Lemma 3.7. The strategy for the duplicator is very simple: if the
spoiler plays in L≤a
1 , the duplicator uses the winning strategy for L≤a
1 ≡k L≤b
2 ,
and if the spoiler plays in L≥a
1 , the duplicator uses the winning strategy for
L≥a
1 ≡k L≥b
2 (the case when the spoiler plays in L2 is symmetric). By the
remark preceding the lemma, the duplicator always responds to a by b and
to b by a, which implies that the strategy allows him to win in the k-round
game on (L1, a) and (L2, b).
And now we prove Theorem 3.6. The proof again is by induction on k, and
the base case is easily veriﬁed. For the induction step, assume we have two
linear orderings, L1 and L2, of length at least 2k
. Suppose the spoiler plays
a ∈ L1 (the case when the spoiler plays in L2 is symmetric). We will show
how to ﬁnd b ∈ L2 so that (L1, a) ≡k−1 (L2, b). There are three cases:
• The length of L≤a
1 is less than 2k−1
. Then let b be an element of L2
such that d(minL1
, a) = d(minL2
, b); in other words, L≤a
1
∼= L≤b
2 . Since
the length of each of L≥a
1 and L≥b
2 is at least 2k−1
, by the induction
hypothesis, L≥a
1 ≡k−1 L≥b
2 . Hence, by Lemma 3.7, (L1, a) ≡k−1 (L2, b).
• The length of L≥a
1 is less than 2k−1
. This case is symmetric to the previous
case.
• The lengths of both L≤a
1 and L≥a
1 are at least 2k−1
. Since the length of L2
is at least 2k
, we can ﬁnd b ∈ L2 such that the lengths of both L≤b
2 and
L≥b
2 are at least 2k−1
. Then, by the induction hypothesis, L≤a
1 ≡k−1 L≤b
2
and L≥a
1 ≡k−1 L≥b
2 , and by Lemma 3.7, (L1, a) ≡k−1 (L2, b).
Thus, for every a ∈ L1, we can ﬁnd b ∈ L2 such that (L1, a) ≡k−1 (L2, b) (and
symmetrically with the roles of L1 and L2 reversed). This proves L1 ≡k L2,
and completes the proof of the theorem.
32 3 Ehrenfeucht-Fra¨ıss´e Games
3.3 Games and the Expressive Power of FO
And now it is time to see why games are important. For this, we need a crucial
deﬁnition of quantiﬁer rank.
Deﬁnition 3.8 (Quantiﬁer rank). The quantiﬁer rank of a formula qr(ϕ)
is its depth of quantiﬁer nesting. That is:
• If ϕ is atomic, then qr(ϕ) = 0.
• qr(ϕ1 ∨ ϕ2) = qr(ϕ1 ∧ ϕ2) = max(qr(ϕ1), qr(ϕ2)).
• qr(¬ϕ) = qr(ϕ).
• qr(∃xϕ) = qr(∀xϕ) = qr(ϕ) + 1.
We use the notation FO[k] for all FO formulae of quantiﬁer rank up to k.
In general, quantiﬁer rank of a formula is diﬀerent from the total of number
of quantiﬁers used. For example, we can deﬁne a family of formulae by
induction: d0(x, y) ≡ E(x, y), and dk ≡ ∃z dk−1(x, z) ∧ dk−1(z, y). The quantiﬁer
rank of dk is k, but the total number of quantiﬁers used in dk is 2k
− 1.
For formulae in the prenex form (i.e., all quantiﬁers are in front, followed by
a quantiﬁer-free formula), quantiﬁer rank is the same as the total number of
quantiﬁers.
Given a set S of FO sentences (over vocabulary σ), we say that two σstructures
A and B agree on S if for every sentence Φ of S, it is the case that
A |= Φ ⇔ B |= Φ.
Theorem 3.9 (Ehrenfeucht-Fra¨ıss´e). Let A and B be two structures in a
relational vocabulary. Then the following are equivalent:
1. A and B agree on FO[k].
2. A ≡k B.
We will prove this theorem shortly, but ﬁrst we discuss how this is useful
for proving inexpressibility results.
Characterizing the expressive power of FO via games gives rise to the
following methodology for proving inexpressibility results.
Corollary 3.10. A property P of ﬁnite σ-structures is not expressible in FO
if for every k ∈ N, there exist two ﬁnite σ-structures, Ak and Bk, such that:
• Ak ≡k Bk, and
• Ak has property P, and Bk does not.
3.4 Rank-k Types 33
Proof. Assume to the contrary that P is deﬁnable by a sentence Φ. Let k =
qr(Φ), and pick Ak and Bk as above. Then Ak ≡k Bk, and thus if Ak has
property P, then so does Bk, which contradicts the assumptions.
We shall see in the next section that the if of Corollary 3.10 can be replaced
by iﬀ ; that is, Ehrenfeucht-Fra¨ıss´e games are complete for ﬁrst-order
deﬁnability.
The methodology above extends from sentences to formulas with free vari-
ables.
Corollary 3.11. An m-ary query Q on σ-structures is not expressible in FO
iﬀ for every k ∈ N, there exist two ﬁnite σ-structures, Ak and Bk, and two
m-tuples a and b in them such that:
• (Ak, a) ≡k (Bk, b), and
• a ∈ Q(Ak) and b ∈ Q(Bk).
We next see some simple examples of using games; more examples will
be given in Sect. 3.6. An immediate application of the Ehrenfeucht-Fra¨ıss´e
theorem is that even is not FO-expressible when σ is empty: we take Ak
to contain k elements, and Bk to contain k + 1 elements. However, we have
already proved this by a simple compactness argument in Sect. 3.1. But we
could not prove, by the same argument, that even is not expressible over
ﬁnite linear orders. Now we get this for free:
Corollary 3.12. even is not FO-expressible over linear orders.
Proof. Pick Ak to be a linear order of length 2k
, and Bk to be a linear order
of length 2k
+1. By Theorem 3.6, Ak ≡k Bk. The statement now follows from
Corollary 3.10.
3.4 Rank-k Types
We now further analyze FO[k] and introduce the concept of types (more precisely,
rank-k types).
First, what is FO[0]? It contains Boolean combinations of atomic formulas.
If we are interested in sentences in FO[0], these are precisely atomic
sentences: that is, sentences without quantiﬁers. In a relational vocabulary,
such sentences are Boolean combinations of formulae of the form c = c′
and
R(c1, . . . , ck), where c, c′
, c1, . . . , ck are constant symbols from σ.
Next, assume that ϕ is an FO[k + 1] formula. If ϕ = ϕ1 ∨ ϕ2, then both
ϕ1, ϕ2 are FO[k + 1] formulae, and likewise for ∧; if ϕ = ¬ϕ1, then ϕ1 ∈
FO[k + 1]. However, if ϕ = ∃xψ or ϕ = ∀xψ, then ψ is an FO[k] formula.
Hence, every formula from FO[k + 1] is equivalent to a Boolean combination
of formulae of the form ∃xψ, where ψ ∈ FO[k]. Using this, we show:
34 3 Ehrenfeucht-Fra¨ıss´e Games
Lemma 3.13. If σ is ﬁnite, then up to logical equivalence, FO[k] over σ contains
only ﬁnitely many formulae in m free variables x1, . . . , xm.
Proof. The proof is by induction on k. The base case is FO[0]; there are
only ﬁnitely many atomic formulae, and hence only ﬁnitely many Boolean
combinations of those, up to logical equivalence. Going from k to k + 1, recall
that each formula ϕ(x1, . . . , xm) from FO[k + 1] is a Boolean combination of
∃xm+1ψ(x1, . . . , xm, xm+1), where ψ ∈ FO[k]. By the hypothesis, the number
of FO[k] formulae in m + 1 free variables x1, . . . , xm+1 is ﬁnite (up to logical
equivalence) and hence the same can be concluded about FO[k + 1] formulas
in m free variables.
In model theory, a type (or m-type) of an m-tuple a over a σ structure A
is the set of all FO formulae ϕ in m free variables such that A |= ϕ(a). This
notion is too general in our setting, as the type of a over a ﬁnite A describes
(A, a) up to isomorphism.
Deﬁnition 3.14 (Types). Fix a relational vocabulary σ. Let A be a σstructure,
and a an m-tuple over A. Then the rank-k m-type of a over A
is deﬁned as
tpk(A, a) = {ϕ ∈ FO[k] | A |= ϕ(a)}.
A rank-k m-type is any set of formulae of the form tpk(A, a), where |a|= m.
When m is clear from the context, we speak of rank-k types.
In the special case of m = 0 we deal with tpk(A), deﬁned as the set of
FO[k] sentences that hold in A. Also note that rank-k types are maximally
consistent sets of formulae: that is, each rank-k type S is consistent, and for
every ϕ(x1, . . . , xm) ∈ FO[k], either ϕ ∈ S or ¬ϕ ∈ S.
At this point, it seems that rank-k types are inherently inﬁnite objects, but
they are not, because of Lemma 3.13. We know that up to logical equivalence,
FO[k] is ﬁnite, for a ﬁxed number m of free variables. Let ϕ1(x), . . . , ϕM (x)
enumerate all the nonequivalent formulae in FO[k] with free variables x =
(x1, . . . , xm). Then a rank-k type is uniquely determined by a subset K of
{1, . . . , M} specifying which of the ϕi’s belong to it. Moreover, testing that x
satisﬁes all the ϕi’s with i ∈ K and does not satisfy all the ϕj’s with j ∈ K
can be done by a single formula
αK(x) ≡
i∈K
ϕi ∧
j∈K
¬ϕj. (3.3)
Note that αK(x) itself is an FO[k] formula, since no new quantiﬁers were
introduced.
Furthermore, all the αK’s are mutually exclusive: for K = K′
, if A |=
αK(a), then A |= ¬αK′ (a). Every FO[k] formula is a disjunction of some of
the αK’s: indeed, every FO[k] formula is equivalent to some ϕi in the above
enumeration, which is the disjunction of all αK’s with i ∈ K.
Summing up, we have the following.
3.5 Proof of the Ehrenfeucht-Fra¨ıss´e Theorem 35
Theorem 3.15. a) For a ﬁnite relational vocabulary σ, the number of diﬀerent
rank-k m-types is ﬁnite.
b) Let T1, . . . , Tr enumerate all the rank-k m-types. There exist FO[k] formulae
α1(x), . . . , αr(x) such that:
• for every A and a ∈ Am
, it is the case that A |= αi(a) iﬀ tpk(A, a) = Ti,
and
• every FO[k] formula ϕ(x) in m free variables is equivalent to a disjunction
of some αi’s.
Thus, in what follows we normally associate types with their deﬁning formulae
αi’s (3.3). It is important to remember that these deﬁning formulae for
rank-k types have the same quantiﬁer rank, k.
From the Ehrenfeucht-Fra¨ıss´e theorem and Theorem 3.15, we obtain:
Corollary 3.16. The equivalence relation ≡k is of ﬁnite index (that is, has
ﬁnitely many equivalence classes).
As promised in the last section, we now show that games are complete for
characterizing the expressive power of FO: that is, the if of Corollary 3.10 can
be replaced by iﬀ.
Corollary 3.17. A property P is expressible in FO iﬀ there exists a number
k such that for every two structures A, B, if A ∈ P and A ≡k B, then B ∈ P.
Proof. If P is expressible by an FO sentence Φ, let k = qr(Φ). If A ∈ P,
then A |= Φ, and hence for B with A ≡k B, we have B |= Φ. Thus, B ∈ P.
Conversely, if A ∈ P and A ≡k B imply B ∈ P, then any two structures
with the same rank-k type agree on P, and hence P is a union of types, and
thus deﬁnable by a disjunction of some of the αi’s deﬁned by (3.3).
Thus, a property P is not expressible in FO iﬀ for every k, one can ﬁnd
two structures, Ak ≡k Bk, such that Ak has P and Bk does not.
3.5 Proof of the Ehrenfeucht-Fra¨ıss´e Theorem
We shall prove the equivalence of 1 and 2 in the Ehrenfeucht-Fra¨ıss´e theorem,
as well as a new important condition, the back-and-forth equivalence. Before
stating this condition, we brieﬂy analyze the equivalence relation ≡0.
When does the duplicator win the game without even starting? This happens
iﬀ (∅, ∅) is a partial isomorphism between two structures A and B. That
is, if c is the tuple of constant symbols, then cA
i = cA
j iﬀ cB
i = cB
j for every
i, j, and for each relation symbol R, the tuple (cA
i1
, . . . , cA
ik
) is in RA
iﬀ the
tuple (cB
i1
, . . . , cB
ik
) is in RB
. In other words, (∅, ∅) is a partial isomorphism
between A and B iﬀ A and B satisfy the same atomic sentences.
36 3 Ehrenfeucht-Fra¨ıss´e Games
We now use this as the basis for the inductive deﬁnition of back-and-forth
relations on A and B. More precisely, we deﬁne a family of relations ≃k on
pairs of structures of the same vocabulary as follows:
• A ≃0 B iﬀ A ≡0 B; that is, A and B satisfy the same atomic sentences.
• A ≃k+1 B iﬀ the following two conditions hold:
forth: for every a ∈ A, there exists b ∈ B such that (A, a) ≃k (B, b);
back: for every b ∈ B, there exists a ∈ A such that (A, a) ≃k (B, b).
We now prove the following extension of Theorem 3.9.
Theorem 3.18. Let A and B be two structures in a relational vocabulary σ.
Then the following are equivalent:
1. A and B agree on FO[k].
2. A ≡k B.
3. A ≃k B.
Proof. By induction on k. The case of k = 0 is obvious. We ﬁrst show the
equivalence of 2 and 3. Going from k to k + 1, assume A ≃k+1 B; we must
show A ≡k+1 B. Assume for the ﬁrst move the spoiler plays a ∈ A; we ﬁnd
b ∈ B with (A, a) ≃k (B, b), and thus by the hypothesis (A, a) ≡k (B, b).
Hence the duplicator can continue to play for k moves, and thus wins the
k + 1-move game. The other direction is similar.
With games replaced by the back-and-forth relation, we show the equivalence
of 1 and 3. Assume A and B agree on all quantiﬁer-rank k+1 sentences;
we must show A ≃k+1 B. We prove the forth case; the back case is identical.
Pick a ∈ A, and let αi deﬁne its rank-k 1-type. Then A |= ∃xαi(x). Since
qr(αi) = k, this is a sentence of quantiﬁer-rank k+1; hence B |= ∃xαi(x). Let
b be the witness for the existential quantiﬁer; that is, tpk(A, a) = tpk(B, b).
Hence for every σ1 sentence Ψ of qr(Ψ) = k, we have (A, a) |= Ψ iﬀ (B, b) |= Ψ,
and thus (A, a) and (B, b) agree on quantiﬁer-rank k sentences. By the hypothesis,
this implies (A, a) ≃k (B, b).
For the implication 3 → 1, we need to prove that A ≃k+1 B implies that A
and B agree on FO[k +1]. Every FO[k +1] sentence is a Boolean combination
of ∃xϕ(x), where ϕ ∈ FO[k], so it suﬃces to prove the result for sentences of
the form ∃xϕ(x). Assume that A |= ∃xϕ(x), so A |= ϕ(a) for some a ∈ A. By
forth, ﬁnd b ∈ B such that (A, a) ≃k (B, b); hence (A, a) and (B, b) agree
on FO[k] by the hypothesis. Hence, B |= ϕ(b), and thus B |= ∃xϕ(x). The
converse (that B |= ∃xϕ(x) implies A |= ∃xϕ(x)) is identical, which completes
the proof.
3.6 More Inexpressibility Results 37
⇒
⇒
Fig. 3.3. Reduction of parity to connectivity
3.6 More Inexpressibility Results
So far we have used games to prove that even is not expressible in FO, in
both ordered and unordered settings. Next, we show inexpressibility of graph
connectivity over ﬁnite graphs. In Sect. 3.1 we used compactness to show that
connectivity of arbitrary graphs is inexpressible, leaving open the possibility
that it may be FO-deﬁnable over ﬁnite graphs. We now show that this cannot
happen. It turns out that no new game argument is needed, as the proof uses
a reduction from even over linear orders.
Assume that connectivity of ﬁnite graphs is deﬁnable by an FO sentence
Φ, in the vocabulary that consists of one binary relation symbol E. Next,
given a linear ordering, we deﬁne a directed graph from it as described below.
First, from a linear ordering < we deﬁne the successor relation
succ(x, y) ≡ (x < y) ∧ ∀z (z ≤ x) ∨ (z ≥ y) .
Using this, we deﬁne an FO formula γ(x, y) such that γ(x, y) is true iﬀ one of
the following holds:
• y is the successor of the successor of x: ∃z succ(x, z) ∧ succ(z, y) , or
• x is the predecessor of the last element, and y is the ﬁrst element:
∃z (succ(x, z) ∧ ∀u(u ≤ z)) ∧ ∀u(y ≤ u), or
• x is the last element and y is the successor of the ﬁrst element (the FO
formula is similar to the one above).
Thus, γ(x, y) deﬁnes a new graph on the elements of the linear ordering; the
construction is illustrated in Fig. 3.3.
Now observe that the graph deﬁned by γ is connected iﬀ the size of the
underlying linear ordering is odd. Hence, taking ¬Φ, and substituting γ for
every occurrence of the predicate E, we get a sentence that tests even for
linear orderings. Since this is impossible, we obtain the following.
Corollary 3.19. Connectivity of ﬁnite graphs is not FO-deﬁnable.
38 3 Ehrenfeucht-Fra¨ıss´e Games
. . .
G2
k
G1
k
. . .
. . .
. . .
Fig. 3.4. Graphs G1
k and G2
k
So far all the examples of inexpressibility results proved via EhrenfeuchtFra¨ıss´e
games were fairly simple. Unfortunately, this is a rather unusual situation;
typically game proofs are hard, and often some nontrivial combinatorial
arguments are required. We now present an additional example of a game
proof, as well as a few more problems that could possibly be handled by
games, but are better left until we have seen more powerful techniques. These
show how the diﬃculty of game proofs can rapidly increase as the problems
become more complex.
Suppose that we want to test if a graph is a tree. By trees we mean directed
rooted trees. This seems to be impossible in FO. To prove this, we follow the
general methodology: that is, for each k we must ﬁnd two graphs, G1
k ≡k G2
k,
such that one of them is a tree, and the other one is not.
We choose these graphs as follows: G1
k is the graph of a successor relation
of length 2m, and G2
k has two connected components: one is the graph of a
successor relation of length m, and the other one is a cycle of length m. We
did not say what m is, and it will be clear from the proof what it should be:
at this point we just say that m depends only on k, and is suﬃciently large.
Clearly G1
k is a tree (of degree 1), and G2
k is not, so we must show G1
k ≡k
G2
k. In each of these two graphs there are two special points: the start and
the endpoint of the successor relation. Clearly these must be preserved in the
game, so we may just assume that the game starts in a position where these
points were played. That is, we let a−1, a0 be the start and the endpoint of
G1
k, and b−1, b0 be the start and the endpoint of the successor part of G2
k. We
let ai’s stand for the points played in G1
k, and bi’s for the points played in G2
k.
What do we put in the inductive hypothesis? The approach we take is
very similar to the ﬁrst proof of Theorem 3.6. We deﬁne the distance between
two elements as the length of the shortest path between them. Notice that in
the case of G2
k, the distance could be inﬁnity, as the graph has two connected
3.6 More Inexpressibility Results 39
components. We then show that the duplicator can play in a way that ensures
the following conditions after each round i:
1. if d(aj, al) ≤ 2k−i
, then d(bj, bl) = d(aj, al).
2. if d(aj, al) > 2k−i
, then d(bj, bl) > 2k−i
.
(3.4)
These are very similar to conditions (3.2) used in the proof of Theorem
3.6.
How do we prove that the duplicator can maintain these conditions? Suppose
i rounds have been played, and the spoiler makes his move in round i+1.
If the spoiler plays close (at a distance at most 2k−(i+1)
) to a previously played
point, we can apply the proof of Theorem 3.6 to show that the duplicator has
a response.
But what if the spoiler plays at a distance greater than 2k−(i+1)
from all
the previously played points? In the proof of Theorem 3.6 we were able to place
that move into some interval on a linear ordering and use some knowledge of
that interval to ﬁnd the response – but this does not work any more, since our
graphs now have a diﬀerent structure. Nevertheless, there is a way to ensure
that the duplicator can maintain the winning conditions: simply by choosing
m “very large”, we can always be sure that if fewer than k rounds of the
game have been played, there is a point at a distance greater than 2k−(i+1)
from all the previously played points in the graph. We leave it to the reader
to calculate m for a given k (it is not that much diﬀerent from the bound we
had in Theorem 3.6).
Thus, the duplicator can maintain all the conditions (3.4). In the proof of
Theorem 3.6, one of the conditions of (3.2) stated that the moves in the game
deﬁne a partial isomorphism. Here, we do not have this property, but we can
still derive that after k rounds, the duplicator achieves a partial isomorphism.
Indeed, suppose all k rounds have been played, and we have two elements ai, aj
such that there is an edge between ai and aj. This means that d(ai, aj) = 1,
and, by (3.4), d(bi, bj) = 1. Therefore, there is an edge between bi and bj.
Conversely, let there be an edge between bi and bj. If there is no edge between
ai and aj, then d(ai, aj) > 1, and, by (3.4), d(bi, bj) > 1, which contradicts
our assumption that there is an edge between them.
Thus, we have shown that G1
k ≡k G2
k, which proves the following.
Proposition 3.20. It is impossible to test, by an FO sentence, if a ﬁnite
graph is a tree.
This proof is combinatorially slightly more involved than other game proofs
we have seen, and yet it uses trees with only unary branching. So it does not
tell us whether testing the property of being an n-ary tree, for n > 1, is
expressible. Moreover, one can easily imagine that the combinatorics in a
game argument even for binary trees will be much harder. And what if we are
interested in more complex properties? For example, testing if a graph is:
40 3 Ehrenfeucht-Fra¨ıss´e Games
• a balanced binary tree (the branching factor is 2, and all the maximal
branches are of the same length);
• a binary tree with all the maximal branches of diﬀerent length;
• or even a bit diﬀerent: assuming that we know that the input is a binary
tree, can we check, in FO, if it is balanced?
It would thus be nice to have some easily veriﬁable criteria that guarantee
a winning strategy for the duplicator, and that is exactly what we shall do in
the next chapter.
3.7 Bibliographic Notes
Examples of using compactness for proving some very easy inexpressibility
results over ﬁnite models are taken from V¨a¨an¨anen [239] and Gaifman and
Vardi [89].
Characterization of the expressive power of FO in terms of the back-andforth
equivalence is due to Fra¨ıss´e [84]; the game description of the back-andforth
equivalence is due to Ehrenfeucht [62].
Theorem 3.6 is a classical application of Ehrenfeucht-Fra¨ıss´e games, and
was rediscovered many times, cf. Gurevich [117] and Rosenstein [209]. The
composition method, used in the second proof of Theorem 3.6, will be discussed
elsewhere in the book (e.g., exercise 3.15 in this chapter, as well as
Chap. 7). For a recent survey, see Makowsky [177].
The proof of inexpressibility of connectivity is standard, see, e.g., [60, 133].
Types are a central concept of model theory, see [35, 125, 201]. The proof
of the Ehrenfeucht-Fra¨ıss´e theorem given here is slightly diﬀerent from the
proof one ﬁnds in most texts (e.g., [60, 125]); an alternative proof using what
is called Hintikka formulae is presented in Exercise 3.11.
Some of the exercises for this chapter show that several classical theorems
in model theory (not only compactness) fail over ﬁnite models. For this line of
work, see Gurevich [116], Rosen [207], Rosen and Weinstein [208], Feder and
Vardi [78].
Sources for exercises:
Exercise 3.11: Ebbinghaus and Flum [60]
Exercises 3.12 and 3.13: Gurevich [116]
Exercise 3.14: Ebbinghaus and Flum [60]
Exercise 3.17: Cook and Liu [41]
Exercise 3.18: Pezzoli [199]
3.8 Exercises 41
3.8 Exercises
Exercise 3.1. Use compactness to show that the following is not FO-expressible
over ﬁnite structures in the vocabulary of one unary relation symbol U: for a structure
A, both |UA
| and |A − UA
| are even.
Exercise 3.2. Prove Lemma 3.4 for an arbitrary vocabulary.
Exercise 3.3. Prove Corollary 3.11.
Exercise 3.4. Using Ehrenfeucht-Fra¨ıss´e games, show that acyclicity of ﬁnite
graphs is not FO-deﬁnable.
Exercise 3.5. Same as in the previous exercise, for the following properties of ﬁnite
graphs:
1. Planarity.
2. Hamiltonicity.
3. 2-colorability.
4. k-colorability for any k > 2.
5. Existence of a clique of size at least n/2, where n is the number of nodes.
Exercise 3.6. We now consider a query closely related to even. Let σ be a vocabulary
that includes a unary relation symbol U. We then deﬁne a Boolean query
parityU as follows: a ﬁnite σ-structure A satisﬁes parityU iﬀ
|UA
|= 0 (mod 2).
Prove that if σ = {<, U}, where < is interpreted as a linear ordering on the universe,
then parityU is not FO-deﬁnable.
Exercise 3.7. Theorem 3.6 tells us that L1 ≡k L2 for two linear orders of length
at least 2k
. Is the bound 2k
tight? If it is not, what is the tight bound?
Exercise 3.8. Just as for linear orders, the following can be proved for Gn, the
graph of successor relation on {1, . . . , n}. There is a function f : N → N such that
Gn ≡k Gm whenever n, m ≥ f(k). Calculate f(k).
Exercise 3.9. Consider sets of the form XΦ = {n ∈ N | Ln |= Φ}, where Φ is an
FO sentence, and Ln is a linear order with n elements. Describe these sets.
Exercise 3.10. Find an upper bound, in terms of k, on the number of rank-k types.
Exercise 3.11. The goal of this exercise is to give another proof of the EhrenfeuchtFra¨ıss´e
theorem. In this proof, one constructs formulae deﬁning rank-k types explicitly,
by specifying inductively a winning condition for the duplicator.
Assume that σ is relational. For any σ-structure A and a ∈ Am
, we deﬁne
inductively formulae αk
A,a(x1, . . . , xm) as follows:
• α0
A,a(x) =
V
χ(x) where the conjunction is taken over all atomic or negated
atomic χ such that A |= χ(a). Note that the conjunction is ﬁnite.
42 3 Ehrenfeucht-Fra¨ıss´e Games
• Assuming αk
’s are deﬁned, we deﬁne
αk+1
A,a (x) =
“ ^
c∈A
∃z αk
A,ac(x, z)
”
∧
“
∀z
_
c∈A
αk
A,ac(x, z)
”
.
Prove that the following are equivalent:
1. (A, a) ≡k (B, b);
2. (A, a) ≃k (B, b);
3. for every ϕ(x) with qr(ϕ) ≤ k, we have A |= ϕ(a) iﬀ B |= ϕ(b);
4. B |= αk
A,a(b).
Using this, prove the following statement. Let Q be a query deﬁnable in FO by
a formula of quantiﬁer rank k. Then Q is deﬁnable by the following formula:
_
a∈Q(A)
αk
A,a(x).
Note that the disjunction is ﬁnite, by Lemma 3.13.
Exercise 3.12. Beth’s deﬁnability theorem is a classical result in mathematical
logic: it says that a property is deﬁnable implicitly iﬀ it is deﬁnable explicitly. Explicit
deﬁnability of a k-ary query Q on σ-structures means that there is a formula
ϕ(x1, . . . , xk) such that ϕ(A) = Q(A). Implicit deﬁnability means that there is a
sentence Φ in the language of σ expanded with a single k-ary relation P such that
for every σ-structure A, there exists a unique set P ⊆ Ak
such that (A, P) |= Φ and
P = Q(A).
Prove that Beth’s theorem fails over ﬁnite models.
Hint: P is a unary query that returns the set of even elements in a linear order.
Exercise 3.13. Craig’s interpolation is another classical result from mathematical
logic. Let σ1
, σ2
be two vocabularies, and σ = σ1
∩ σ2
. Let Φi
be a sentence over σi
,
i = 1, 2. Assume that Φ1
⊢ Φ2
. Craig’s theorem says that there exists a sentence Φ
over σ such that Φ1
⊢ Φ and Φ ⊢ Φ2
.
Using techniques similar to those in the previous exercise, prove that Craig’s
interpolation fails over ﬁnite models.
Exercise 3.14. This exercise demonstrates another example of a result from mathematical
logic that fails over ﬁnite models. The Los-Tarski theorem says that a
sentence which is preserved under extensions (that is, A ⊆ B and A |= Φ implies
B |= Φ) is equivalent to an existential sentence: a sentence built from atomic and
negated atomic formulae by using ∨, ∧, and ∃.
Prove that the Los-Tarski theorem fails over ﬁnite models.
Exercise 3.15. Winning strategies for complex structures can be composed from
winning strategies for simpler structures. Two commonly used examples of such
compositions are the subject of this exercise.
Given two structures A, B of the same vocabulary σ, their Cartesian product A×
B is deﬁned as a σ-structure whose universe is A×B, each constant c is interpreted as
a pair (cA
, cB
), and each m-ary relation P is interpreted as {((a1, b1), . . . , (am, bm)) |
(a1, . . . , am) ∈ PA
, (b1, . . . , bm) ∈ PB
}.
If the vocabulary contains only relation symbols, the disjoint union A
‘
B for
two structures with A ∩ B = ∅ has the universe A ∪ B, and each relation P is
interpreted as PA
∪ PB
.
Assume A1 ≡k A2 and B1 ≡k B2. Show that:
3.8 Exercises 43
• A1 × B1 ≡k A2 × B2;
• A1
‘
B1 ≡k A2
‘
B2.
Exercise 3.16. The n×m grid is a graph whose set of nodes is {(i, j) | i ≤ n, j ≤ m}
for some n, m ∈ N, and whose edges go from (i, j) to (i + 1, j) and to (i, j + 1). Use
composition of Ehrenfeucht-Fra¨ıss´e games to show that there are no FO sentences
testing if n = m (n > m) for the n × m grid.
Exercise 3.17. Consider ﬁnite structures which are disjoint unions of ﬁnite linear
orderings. Such structures occur in AI applications under the name of blocks world.
Use Ehrenfeucht-Fra¨ıss´e games to show that the theory of such structures is decidable,
and ﬁnitely axiomatizable.
Exercise 3.18. Fix a relational vocabulary σ that has at least one unary and one
ternary relation. Prove that the following is Pspace-complete. Given k, and two
σ-structures A and B, is A ≡k B?
What happens if k is ﬁxed?
Exercise 3.19.∗
A sentence Φ of vocabulary σ is called positive if no symbol from σ
occurs under the scope of an odd number of negations in Φ. We say that a sentence
Φ is preserved under surjective homomorphisms if A |= Φ and h(A) = B implies
B |= Φ, where h : A → B is a homomorphism such that h(A) = B. Lyndon’s
theorem says that if Φ is preserved under surjective homomorphisms (where A, B
could be arbitrary structures), then Φ is equivalent to a positive sentence.
Does Lyndon’s theorem hold in the ﬁnite? That is, if Φ is preserved under surjective
homomorphisms over ﬁnite structures, is it the case that, over ﬁnite structures,
Φ is equivalent to a positive sentence?
4
Locality and Winning Games
Winning games becomes nontrivial even for fairly simple examples. But often
we can avoid complicated combinatorial arguments, by using rather simple
suﬃcient conditions that guarantee a winning strategy for the duplicator. For
ﬁrst-order logic, most such conditions are based on the idea of locality, best
illustrated by the example in Fig. 4.1.
Suppose we want to show that the transitive closure query is not expressible
in FO. We assume, to the contrary, that it is deﬁnable by a formula
ϕ(x, y), and then use the locality of FO to conclude that such a formula can
only see up to some distance r from its free variables, where r is determined
by ϕ. Then we take a successor relation A long enough so that the distance
from a and b to each other and the endpoints is bigger than 2r – in that
case, ϕ cannot see the diﬀerence between (a, b) and (b, a), but our assumption
implies that A |= ϕ(a, b) ∧ ¬ϕ(b, a) since a precedes b.
The goal of this chapter is to formalize this type of reasoning, and use it
to provide winning strategies for the duplicator. Such strategies will help us
ﬁnd easy criteria for FO-deﬁnability.
Throughout the chapter, we assume that the vocabulary σ is purely relational;
that is, contains only relation symbols. All the results extend easily
to the case of vocabularies that have constant symbols (see Exercise 4.1), but
restricting to purely relational vocabularies often makes notations simpler.
4.1 Neighborhoods, Hanf-locality, and Gaifman-locality
We start by deﬁning neighborhoods that formalize the concept of “seeing up
to distance r from the free variables”.
Deﬁnition 4.1. Given a σ-structure A, its Gaifman graph, denoted by G(A),
is deﬁned as follows. The set of nodes of G(A) is A, the universe of A. There
is an edge (a1, a2) in G(A) iﬀ a1 = a2, or there is a relation R in σ such that
for some tuple t ∈ RA
, both a1, a2 occur in t.
46 4 Locality and Winning Games
... ... ... ... ... ... ... ...a b
r r
Fig. 4.1. A local formula cannot distinguish (a, b) from (b, a)
Note that G(A) is an undirected graph. If A is an undirected graph to start
with, then G(A) is simply A together with the diagonal {(a, a) | a ∈ A}. If A
is a directed graph, then G(A) simply forgets about the orientation (and adds
the diagonal as well).
By the distance dA(x, y) we mean the distance in the Gaifman graph: that
is, the length of the shortest path from x to y in G(A). If there is no such
path, then dA(x, a) = ∞. It is easy to verify that the distance satisﬁes all the
usual properties of a metric: dA(x, y) = 0 iﬀ x = y, dA(x, y) = dA(y, x), and
dA(x, z) ≤ dA(x, y) + dA(y, z), for all x, y, z.
If we are given two tuples, a = (a1, . . . , an) and b = (b1, . . . , bm), and an
element c, then
dA(a, c) = min
1≤i≤n
dA(ai, c),
dA(a, b) = min
1≤i≤n, 1≤j≤m
dA(ai, bj).
Furthermore, ac stands for the n + 1-tuple (a1, . . . , an, c), and ab stands for
the n + m-tuple (a1, . . . , an, b1, . . . , bm).
Recall that we use the notation σn for σ expanded with n constant symbols.
Deﬁnition 4.2. Let σ contain only relation symbols, and let A be a σstructure,
and a = (a1, . . . , an) ∈ An
. The radius r ball around a is the
set
BA
r (a) = {b ∈ A | dA(a, b) ≤ r}.
The r-neighborhood of a in A is the σn-structure NA
r (a), where:
• the universe is BA
r (a);
• each k-ary relation R is interpreted as RA
restricted to BA
r (a); that is,
RA
∩ (BA
r (a))k
;
• n additional constants are interpreted as a1, . . . , an.
Note that since we deﬁne a neighborhood around an n-tuple as a σnstructure,
for any isomorphism h between two isomorphic neighborhoods
NA
r (a1, . . . , an) and NB
r (b1, . . . , bn), it must be the case that h(ai) = bi, 1 ≤
i ≤ n.
4.1 Neighborhoods, Hanf-locality, and Gaifman-locality 47
Deﬁnition 4.3. Let A, B be σ-structures, where σ only contains relation symbols.
Let a ∈ An
and b ∈ Bn
. We write
(A, a) ⇆d (B, b)
if there exists a bijection f : A → B such that for every c ∈ A,
NA
d (ac) ∼= NB
d (bf(c)).
We shall often deal with the case of n = 0; then A⇆dB means that for
some bijection f : A → B,
NA
d (c) ∼= NB
d (f(c)) for all c ∈ A.
The ⇆d relation says, in a sense, that locally two structures look the
same, with respect to a certain bijection f; that is, f sends each element
c into f(c) that has the same neighborhood. The lemma below summarizes
some properties of this relation:
Lemma 4.4. 1. (A, a)⇆d(B, b) ⇒|A|=|B |.
2. (A, a)⇆d(B, b) ⇒ (A, a)⇆d′ (B, b), for d′
≤ d.
3. (A, a)⇆d(B, b) ⇒ NA
d (a) ∼= NB
d (b).
Recall that a neighborhood of an n-tuple is a σn-structure. By an isomorphism
type of such structures we mean an equivalence class of ∼= on
STRUCT[σn]. We shall use the letter τ (with sub- and superscripts) to denote
isomorphism types. Instead of saying that a structure belongs to τ, we shall
say that it is of the isomorphism type τ.
If τ is an isomorphism type of σn-structures, and a ∈ An
, we say that a
d-realizes τ in A if NA
d (a) is of type τ. If d is understood from the context,
we say that a realizes τ.
The following is now easily proved from the deﬁnition of the ⇆d relation.
Lemma 4.5. Let A, B ∈ STRUCT[σ]. Then A ⇆d B iﬀ for each isomorphism
type τ of σ1-structures, the number of elements of A and B that drealize
τ is the same.
We now formulate the ﬁrst locality criterion.
Deﬁnition 4.6 (Hanf-locality). An m-ary query Q on σ-structures is Hanflocal
if there exists a number d ≥ 0 such that for every A, B ∈ STRUCT[σ],
a ∈ Am
, b ∈ Bm
,
(A, a) ⇆d (B, b) implies a ∈ Q(A) ⇔ b ∈ Q(B) .
The smallest d for which the above condition holds is called the Hanf-locality
rank of Q and is denoted by hlr(Q).
48 4 Locality and Winning Games
. . .
. . .
one cycle of length 2m
G2
m
two cycles of length m
G1
m
. .
. .. .
. .
Fig. 4.2. Connectivity is not Hanf-local
Most commonly Hanf-locality is used for Boolean queries; then the deﬁnition
says that for some d ≥ 0, for every A, B ∈ STRUCT[σ], the condition
A ⇆d B implies that A and B agree on Q.
Using Hanf-locality for proving that a query Q is not deﬁnable in a logic
L then amounts to showing:
• that every L-deﬁnable query is Hanf-local, and
• that Q is not Hanf-local.
We now give the canonical example of using Hanf-locality. We show, by a
very simple argument, that graph connectivity is not Hanf-local; it will then
follow that graph connectivity is not expressible in any logic that only deﬁnes
Hanf-local Boolean queries.
Assume to the contrary that the graph connectivity query Q is Hanf-local,
and hlr(Q) = d. Let m > 2d + 1, and choose two graphs G1
m and G2
m as
shown in Fig. 4.2. Their sets of nodes have the same cardinality. Let f be an
arbitrary bijection between the nodes of G1
m and G2
m. Since each cycle is of
length > 2d + 1, the d-neighborhood of any node a is the same: it is a chain
of length 2d with a in the middle. Hence, G1
m ⇆d G2
m, and they must agree
on Q, but G2
m is connected, and G1
m is not. Thus, graph connectivity is not
Hanf-local.
While Hanf-locality works well for Boolean queries, a diﬀerent notion is
often helpful for m-ary queries, m > 0.
Deﬁnition 4.7 (Gaifman-locality). An m-ary query Q, m > 0, on σstructures,
is called Gaifman-local if there exists a number d ≥ 0 such that for
every σ-structure A and every a1, a2 ∈ Am
,
4.2 Combinatorics of Neighborhoods 49
NA
d (a1) ∼= NA
d (a2) implies a1 ∈ Q(A) ⇔ a2 ∈ Q(A) .
The minimum d for which the above condition holds is called the locality rank
of Q, and is denoted by lr(Q).
Note the diﬀerence between Hanf- and Gaifman-locality: the former relates
two diﬀerent structures, while the latter is talking about deﬁnability in one
structure.
The methodology for proving inexpressibility of queries using Gaifmanlocality
is then as follows:
• ﬁrst we show that all m-ary queries, m > 0, deﬁnable in a logic L are
Gaifman-local,
• then we show that a given query Q is not Gaifman-local.
We shall see many examples of logics that deﬁne only Gaifman-local
queries. At this point, we give a typical example of a query that is not
Gaifman-local. The query is transitive closure, and we already saw that it is
not Gaifman-local. Recall Fig. 4.1. Assume that the transitive closure query Q
is Gaifman-local, and let lr(Q) = r. If a, b are at a distance > 2r +1 from each
other and the start and the endpoints, then the r-neighborhoods of (a, b) and
(b, a) are isomorphic, since each is a disjoint union of two chains of length 2r.
We know that (a, b) belongs to the output of Q; hence by Gaifman-locality,
(b, a) is in the output as well, which contradicts the assumption that Q deﬁnes
transitive closure.
These examples demonstrate that locality tools are rather easy to use to
obtain inexpressibility results. Our goal now is to show that FO-deﬁnable
queries are both Hanf-local and Gaifman-local.
4.2 Combinatorics of Neighborhoods
The main technical tool for proving locality is combinatorial reasoning about
neighborhoods. We start by presenting simple properties of neighborhoods;
proofs are left as an exercise for the reader.
Lemma 4.8. • Assume that A, B ∈ STRUCT[σ] and h : NA
r (a) → NB
r (b)
is an isomorphism. Let d ≤ r. Then h restricted to BA
d (a) is an isomorphism
between NA
d (a) and NB
d (b).
• Assume that A, B ∈ STRUCT[σ] and h : NA
r (a) → NB
r (b) is an isomorphism.
Let d + l ≤ r and x be a tuple from BA
l (a). Then h(BA
d (x)) =
BB
d (h(x)), and NA
d (x) and NB
d (h(x)) are isomorphic.
• Let A, B ∈ STRUCT[σ] and let a1 ∈ An
, b1 ∈ Bn
for n ≥ 1, and a2 ∈
Am
, b2 ∈ Bm
for m ≥ 1. Assume that NA
r (a1) ∼= NB
r (b1), NA
r (a2) ∼=
NB
r (b2), and dA(a1, a2), dB(b1, b2) > 2r+1. Then NA
r (a1a2) ∼= NB
r (b1b2).
50 4 Locality and Winning Games
From now on, we shall use the notation
a ≈A,B
r b
for NA
r (a) ∼= NB
r (b), omitting A and B when they are understood. We shall
also write d(·, ·) instead of dA(·, ·) when A is understood.
The main technical result of this section is the lemma below.
Lemma 4.9. If A⇆dB and a ≈A,B
3d+1 b, then (A, a)⇆d(B, b).
Proof. We need to deﬁne a bijection f : A → B such that ac ≈A,B
d bf(c)
for every c ∈ A. Since a ≈A,B
3d+1 b, there is an isomorphism h : NA
3d+1(a) →
NB
3d+1(b). Then the restriction of h to BA
2d+1(a) is an isomorphism between
NA
2d+1(a) and NB
2d+1(b). Since |A|=|B |, we obtain
|A − BA
2d+1(a)| = |B − BB
2d+1(b)| .
Now consider an arbitrary isomorphism type τ of a d-neighborhood of a
single point. Assume that c ∈ BA
2d+1(a) realizes τ in A. Since h is an isomorphism
of 3d + 1-neighborhoods, BA
d (c) ⊆ BA
3d+1(a) and thus h(c) ∈ BB
2d+1(b)
realizes τ. Similarly, if c ∈ BB
2d+1(b) realizes τ, then so does h−1
(c) ∈ BA
2d+1(a).
Hence, the number of elements in BA
2d+1(a) and BB
2d+1(b) that realize τ is the
same.
Since A⇆dB, the number of elements of A and of B that realize τ is the
same. Therefore,
|{a ∈ A − BA
2d+1(a) | a d-realizes τ}|
= |{b ∈ B − BB
2d+1(b) | b d-realizes τ}|
(4.1)
for every τ. Using (4.1), we can ﬁnd a bijection g : A − BA
2d+1(a) → B −
BB
2d+1(b) such that c ≈d g(c) for every c ∈ A − BA
2d+1(a).
We now deﬁne f by
f(c) =
h(c) if c ∈ BA
2d+1(a)
g(c) if c ∈ BA
2d+1(a).
It is clear that f is a bijection A → B.
We claim that ac ≈d bf(c) for every c ∈ A. This is illustrated in Fig. 4.3.
If c ∈ BA
2d+1(a), then BA
d (c) ⊆ BA
3d+1(a), and ac ≈d bh(c) because h is an
isomorphism. If c ∈ BA
2d+1(a), then f(c) = g(c) ∈ BB
2d+1(b), and c ≈d g(c).
Since d(c, a), d(g(c), b) > 2d + 1, by Lemma 4.8, ac ≈d bg(c).
The following corollary is very useful in establishing locality of logics.
Corollary 4.10. If (A, a)⇆3d+1(B, b), then there exists a bijection f : A →
B such that
∀c ∈ A (A, ac) ⇆d (B, bf(c)).
4.3 Locality of FO 51
A
A
B
B
3d + 1
2d + 1
d
h
3d + 1
2d + 1
d
d
g
a
a
b
b
d
d
d
Fig. 4.3. Illustration of the proof of Lemma 4.9
Proof. By the deﬁnition of the ⇆ relation, there exists a bijection f : A → B,
such that for any c ∈ A, ac ≈A,B
3d+1 bf(c). Since A⇆3d+1B, we have A⇆dB.
By Lemma 4.9, (A, ac)⇆d(B, bf(c)).
4.3 Locality of FO
We now show that FO-deﬁnable queries are both Hanf-local and Gaifmanlocal.
In fact, it suﬃces to prove the former, due to the following result.
Theorem 4.11. If Q is a Hanf-local non-Boolean query, then Q is Gaifmanlocal,
and lr(Q) ≤ 3 · hlr(Q) + 1.
Proof. Suppose Q is an m-ary query on STRUCT[σ], m > 0, and hlr(Q) = d.
Let A be a σ-structure, and let a1 ≈A
3d+1 a2. Since A⇆dA, by Lemma 4.9,
52 4 Locality and Winning Games
(A, a1) ⇆d (A, a2), and hence a1 ∈ Q(A) iﬀ a2 ∈ Q(A), which proves lr(Q) ≤
3d + 1.
Theorem 4.12. Every FO-deﬁnable query Q is Hanf-local. Moreover, if Q is
deﬁned by an FO[k] formula (that is, an FO formula whose quantiﬁer rank is
at most k), then
hlr(Q) ≤
3k
− 1
2
.
Proof. By induction on the quantiﬁer rank. If k = 0, then (A, a)⇆0(B, b)
means that (a, b) deﬁnes a partial isomorphism between A and B, and thus a
and b satisfy the same atomic formulas. Hence hlr(Q) = 0, if Q is deﬁned by
an FO[0] formula.
Suppose Q is deﬁned by a formula of quantiﬁer rank k +1. Such a formula
is a Boolean combination of formulae of the form ∃zϕ(x, z) where qr(ϕ) ≤ k.
Note that it follows immediately from the deﬁnition of Hanf-locality that if
ψ is a Boolean combination of ψ1, . . . , ψl, and for all i ≤ l, hlr(ψi) ≤ d, then
hlr(ψ) ≤ d. Thus, it suﬃces to prove that the Hanf-locality rank of the query
deﬁned by ∃zϕ is at most 3d + 1, where d is the Hanf-locality rank of the
query deﬁned by ϕ.
To see this, let (A, a) ⇆3d+1 (B, b). By Corollary 4.10, we ﬁnd a bijection
f : A → B such that (A, ac) ⇆d (B, bf(c)) for every c ∈ A. Since hlr(ϕ) = d,
we have A |= ϕ(a, c) iﬀ B |= ϕ(b, f(c)). Hence,
A |= ∃z ϕ(a, z)
⇒ A |= ϕ(a, c) for some c ∈ A
⇒ B |= ϕ(b, f(c))
⇒ B |= ∃z ϕ(b, z).
The same proof shows B |= ∃z ϕ(b, z) implies A |= ∃z ϕ(a, z). Thus, a and
b agree on the query deﬁned by ∃zϕ(x, z), which completes the proof.
Combining Theorems 4.11 and 4.12, we obtain:
Corollary 4.13. Every FO-deﬁnable m-ary query Q, m > 0, is Gaifmanlocal.
Moreover, if Q is deﬁnable by an FO[k] formula, then
lr(Q) ≤
3k+1
− 1
2
.
Since we know that graph connectivity is not Hanf-local and transitive
closure is not Gaifman-local, we immediately obtain, without using games,
that these queries are not FO-deﬁnable.
We can give rather easy inexpressibility proofs for many queries. Below,
we provide two examples.
4.3 Locality of FO 53
d d d d
d−1 d−1 d d+1
Fig. 4.4. Balanced binary trees are not FO-deﬁnable
Balanced Binary Trees
This example was mentioned at the end of Chap. 3. Suppose we are given a
graph, and we want to test if it is a balanced binary tree. We now sketch the
proof of inexpressibility of this query in FO; details are left as an exercise for
the reader.
Suppose a test for being a balanced binary tree is deﬁnable in FO, say by
a sentence Φ of quantiﬁer rank k. Then we know that it is a Hanf-local query,
with Hanf-locality rank at most r = (3k
− 1)/2. Choose d to be much larger
than r, and consider two trees shown in Fig. 4.4.
In the ﬁrst tree, denoted by T1, the subtrees hanging at all four nodes
on the second level are balanced binary trees of depth d; in the second tree,
denoted by T2, they are balanced binary trees of depths d − 1, d − 1, d, and
d + 1. We claim that T1 ⇆r T2 holds.
First, notice that the number of nodes and the number of leaves in T1
and T2 is the same. If d is suﬃciently large, these trees realize the following
isomorphism types of neighborhoods:
• isomorphism types of r-neighborhoods of nodes a at a distance m from
the root, m ≤ r;
• isomorphism types of r-neighborhoods of nodes a at a distance m from a
leaf, m ≤ r;
• the isomorphism type of the r-neighborhood of a node a at a distance > r
from both the root and all the leaves.
Since the number of leaves and the number of nodes are the same, it
is easy to see that each type of an r-neighborhood has the same number of
nodes realizing it in both T1 and T2, and hence T1 ⇆r T2. But this contradicts
Hanf-locality of the balanced binary tree test, since T1 is balanced, and T2 is
not.
Same Generation
The query we consider now is same generation: given a graph, two nodes a
and b are in the same generation if there is a node c (common ancestor) such
54 4 Locality and Winning Games
r
...
... ...
a0 a1 ad
b0 b1 bd bd+1 b2d+1
Fig. 4.5. Inexpressibility of same generation
that the shortest paths from c to a and from c to b have the same length. This
query is most commonly computed on trees; in this case a, b are in the same
generation if they are at the same distance from the root.
We now give a very simple proof that the same-generation query Qsg
is not FO-deﬁnable. Assume to the contrary that it is FO-deﬁnable, and
lr(Qsg) = d. Consider a tree T with root r and two branches, one with nodes
a0, a1, . . . , ad (where ai+1 is the successor of ai) and the other one with nodes
b0, b1, . . . , bd, . . . , b2d+1, see Fig. 4.5.
It is clear that (ad, bd) ≈T
d (ad, bd+1), while ad, bd are in the same generation,
and ad, bd+1 are not.
In most examples seen so far, locality ranks (for either Hanf- or Gaifmanlocality)
were exponential in the quantiﬁer rank. We now show a simple exponential
lower bound for the locality rank; precise bounds will be given in
Exercise 4.11.
Suppose that σ is the vocabulary of undirected graphs: that is, σ = {E}
where E is binary. Deﬁne the following formulae:
• d0(x, y) ≡ E(x, y),
• d1(x, y) ≡ ∃z (d0(x, z) ∧ d0(y, z)), . . .,
• dk+1(x, y) ≡ ∃z(dk(x, z) ∧ dk(y, z)).
For an undirected graph, dk(a, b) holds iﬀ there is a path of length 2k
between
a and b; that is, if the distance between a and b is at most 2k
. Hence, lr(dk) ≥
2k−1
. However, qr(dk) = k, which shows that locality rank can be exponential
in the quantiﬁer rank.
4.4 Structures of Small Degree
In this section, we shall see a large class of structures for which very simple
criteria for FO-deﬁnability can be obtained. These are structures in which all
the degrees are bounded by a constant. If we deal with undirected graphs,
degrees are the usual degrees of nodes; if we deal with directed graphs, they
are in- and out-degrees. In general, we use the following deﬁnition.
4.4 Structures of Small Degree 55
Deﬁnition 4.14. Let σ be a relational vocabulary, R an m-ary symbol in σ,
and A ∈ STRUCT[σ]. For a ∈ A and i ≤ m, deﬁne degreeA
R,i(a) as the
cardinality of the set
{(a1, . . . , ai−1, a, ai+1, . . . , am) ∈ RA
| a1, . . . , ai−1, ai+1, . . . , am ∈ A}.
That is, degreeA
R,i(a) is the number of tuples in RA
that have a in the ith
position.
Deﬁne deg set(A) to be the set of all the numbers of the form degreeA
R,i(a),
where a ∈ A, R ∈ σ, and i is at most the arity of R. That is,
deg set(A) = {degreeA
R,i(a) | a ∈ A, R ∈ σ, i ≤ arity(R)}.
Finally, STRUCTl[σ] stands for
{A ∈ STRUCT[σ] | deg set(A) ⊆ {0, . . . , l}}.
In other words, STRUCTl[σ] consists of σ-structures in which all degrees do
not exceed l.
We shall also be applying deg set to outputs of queries: by deg set(Q(A)),
for an m-ary query Q, we mean the set of all degrees realized in the structure
whose only m-ary relation is Q(A); that is, deg set( A, Q(A) ).
When we talk of structures of small degree, we mean STRUCTl[σ] for some
ﬁxed l ∈ N.
There is another way of deﬁning structures of small degree, essentially
equivalent to the way we use here. Instead of deﬁning degrees for m-ary
relations, one can use only the deﬁnition of degrees for nodes of an undirected
graph, and deﬁne structures of small degrees as structures A where
deg set(G(A)) ⊆ {0, . . ., l} for some l ∈ N. Recall that G(A) is the Gaifman
graph of A, so in this case we are talking about the usual degrees in a graph.
However, this is essentially the same as the deﬁnition of STRUCTl[σ].
Lemma 4.15. For every relational vocabulary σ, there exist two functions
fσ, gσ : N → N such that
1. deg set(G(A)) ⊆ {0, . . ., fσ(l)} for every A ∈ STRUCTl[σ], and
2. A ∈ STRUCTgσ(l)[σ] for every A with deg set(G(A)) ⊆ {0, . . . , l}.
One reason to study structures of small degrees is that many queries behave
particularly nicely on them. We capture this notion of nice behavior by the
following deﬁnition.
Deﬁnition 4.16. Let σ be relational. An m-ary query Q on σ-structures, m >
0, has the bounded number of degrees property (BNDP) if there exists a
function fQ : N → N such that for every l ≥ 0 and every A ∈ STRUCTl[σ],
|deg set(Q(A))| ≤ fQ(l).
56 4 Locality and Winning Games
Notice a certain asymmetry of this deﬁnition: our assumption is that all
the numbers in deg set(A) are small, but the conclusion is that the cardinality
of deg set(Q(A)) is small. We cannot possibly ask for all the numbers
in deg set(Q(A)) to be small and still say anything interesting about
FO-deﬁnable queries: consider, for example, the query deﬁned by ϕ(y, z) ≡
∃x(x = x). On every structure A with | A |= n > 0, it deﬁnes the complete
graph on n nodes, where every node has the same degree n. Hence, some degrees
in deg set(Q(A)) do depend on A, but the number of diﬀerent degrees is
determined by deg set(A) and the query.
It is usually very easy to show that a query does not have the BNDP.
Consider, for example, the transitive closure query. Assume that its input is a
successor relation Gn on n nodes. Then deg set(Gn) = {0, 1}. The transitive
closure of Gn is a linear order Ln on n nodes, and deg set(Ln) = {0, . . ., n−1},
showing that the transitive closure query does not have the BNDP.
We next show that the BNDP is closely related to locality concepts.
Theorem 4.17. Let Q be a Gaifman-local m-ary query, m > 0. Then Q has
the BNDP.
Proof. Let Q be Gaifman-local with lr(Q) = d. We assume, without loss of
generality, that m ≥ 2, since unary queries clearly have the BNDP.
Next, we need the following claim. Let nd(k) be deﬁned inductively by
nd(0) = d, nd(k + 1) = 3 · nd(k) + 1. That is, nd(k) = 3k
· d + (3k
− 1)/2 for
k ≥ 0.
Claim 4.18. Let a ≈A
nd(k) b. Then there is a bijection f : Ak
→ Ak
such that
ac ≈A
d bf(c) for every c ∈ Ak
.
The proof of Claim 4.18 is by induction on k. For k = 0 there is nothing to
prove. Assume that it holds for k, and prove it for k + 1. Let r = nd(k); then
nd(k+1) = 3r+1. Let a ≈A
3r+1 b. Then, by Lemma 4.9, (A, a) ⇆r (A, b). That
is, there exists a bijection g : A → A such that for every c ∈ A, ac ≈A
r bg(c).
By the induction hypothesis, we then know that for each c ∈ A, there exists
a bijection gc : Ak
→ Ak
such that for every e ∈ Ak
,
ace ≈A
d bg(c)gc(e).
We thus deﬁne a bijection f : Ak+1
→ Ak+1
as follows: if c = ce, where
e ∈ Ak
, then f(c) = g(c)gc(e). Clearly, ac ≈A
d bf(c). This proves the claim.
Now we prove the BNDP. First, note that for every vocabulary σ, there
exists a function Gσ : N × N → N such that for every A ∈ STRUCTl[σ], the
size of BA
d (a) is at most Gσ(l, d). Thus, there exists a function Fσ : N × N →
N such that every structure A in STRUCTl[σ] can realize at most Fσ(l, d)
isomorphism types of d-neighborhoods of a point.
Now consider Q(A), for A ∈ STRUCTl[σ], and note that for any two
a, b ∈ A with a ≈A
nd(m−1) b,
4.5 Locality of FO Revisited 57
|{c ∈ Am−1
| ac ∈ Q(A)}| = |{c ∈ Am−1
| bc ∈ Q(A)}|, (4.2)
by Claim 4.18. In particular, (4.2) implies that the degrees of a and b in
Q(A) (in the ﬁrst position of an m-tuple) are the same. This is because
degree
Q(A)
1 (c), the degree of an element c, corresponding to the ﬁrst position
of the m-ary relation Q(A), is precisely the cardinality of the set
{c ∈ Am−1
| cc ∈ Q(A)}. Thus, the number of diﬀerent degrees in Q(A)
corresponding to the ﬁrst position in the m-tuple is at most Fσ(l, nd(m − 1)),
and hence
|deg set(Q(A))| ≤ m · Fσ(l, nd(m − 1)). (4.3)
Since the upper bound in (4.3) depends on l, m, d, and σ only, this proves the
BNDP.
Corollary 4.19. Every FO-deﬁnable query has the BNDP.
Balanced Binary Trees Revisited
We now revisit the balanced binary tree test, and give a simple proof of its
inexpressibility in FO. In fact, we show that this test is inexpressible even
if it is restricted to binary trees. That is, there is no FO-deﬁnable Boolean
query Qbbt such that, for a binary tree T , the output Qbbt(T ) is true iﬀ T is
balanced.
Assume, to the contrary, that such a query is FO-deﬁnable. We now construct
a binary FO-deﬁnable query Q which fails the BNDP – this would
contradict Corollary 4.19.
The new query Q works as follows. It takes as an input a binary tree T ,
and for every two nonleaf nodes a, b ﬁnds their successors a′
, a′′
and b′
, b′′
. It
then constructs a new tree Ta,b by removing the edges from a to a′
, a′′
and
from b to b′
, b′′
, and instead by adding the edges from a to b′
, b′′
and from b
to a′
, a′′
. It then puts (a, b) in the output if Qbbt(Ta,b) is true (see Fig. 4.6).
Clearly, Q is FO-deﬁnable, if Qbbt is.
Assume that T itself is a balanced binary tree; that is a structure in
STRUCT2[σ]. Then for two nonleaf nodes a, b, the pair (a, b) is in Q(T ) iﬀ a, b
are at the same distance from the root. Hence, for a balanced binary tree T of
depth n, the graph Q(T ) is a disjoint union of n − 1 cliques of diﬀerent sizes,
and thus | deg set(Q(T )) |= n − 1. Hence, Q fails the BNDP, which proves
that Qbbt is not FO-deﬁnable.
4.5 Locality of FO Revisited
In this section, we start by analyzing the proof of Hanf-locality of FO,
and discover that it establishes a stronger statement than that of Theorem
4.12. We characterize a new notion of expressibility via a stronger version of
Ehrenfeucht-Fra¨ıss´e games, which will later be used to prove bounds on logics
58 4 Locality and Winning Games
a b
a′
a′′
b′
b′′
Fig. 4.6. Changing successors of nodes in a balanced binary tree
with counting quantiﬁers. The question that we ask then is: are there more
precise and restrictive locality criteria that can be stated for FO? The answer
to this is positive, and we shall present two such results: Gaifman’s theorem,
and the threshold equivalence criterion.
First, we show how to avoid the restriction that no constant symbols occur
in σ; that is, we extend the notions of the r-ball and r-neighborhood to the case
of arbitrary relational vocabularies σ (vocabularies without function symbols).
Let c = (c1, . . . , cn) list all the constant symbols of σ. Then
BA
r (a) = {b ∈ A | dA(b, a) ≤ r or dA(b, cA
) ≤ r}.
The r-neighborhood of a, with |a|= m, is deﬁned as the structure NA
r (a) in
the vocabulary σm (σ extended with m constants), whose universe is BA
r (a),
the interpretations of σ-relations and constants are inherited from A, and the
m extra constants are interpreted as a.
One can check that all the results proved so far extend to the setting that
allows constants (see Exercise 4.1). From now on, we apply all the locality
concepts to relational vocabularies.
We can also use the notion of locality to state when A ≡0 B; that is,
when the duplicator wins the Ehrenfeucht-Fra¨ıss´e game on A and B without
even starting. This happens if and only if (∅, ∅) is a partial isomorphism, or,
equivalently, NA
0 (∅) ∼= NB
0 (∅).
We now deﬁne a new equivalence relation ≃bij
k as follows.
• A ≃bij
0 B if A ≡0 B;
4.5 Locality of FO Revisited 59
• A ≃bij
k+1 B if there is a bijection f : A → B such that
forth: for each a ∈ A, we have (A, a) ≃bij
k (B, f(a));
back: for each b ∈ B, we have (A, f−1
(b)) ≃bij
k (B, b).
One can easily see that just one of forth and back suﬃces: that is, forth
and back are equivalent, since f is a bijection.
The notion of the back-and-forth described in Sect. 3.5 was equivalent
to the Ehrenfeucht-Fra¨ıss´e game. We can also describe the new notion of
back-and-forth as a game, called a bijective Ehrenfeucht-Fra¨ıss´e game (or just
bijective game). Let A and B be two structures in a relational vocabulary.
The k-round bijective game is played by the same two players, the spoiler and
the duplicator. If |A| = |B |, then the duplicator loses before the game even
starts. In the ith round, the duplicator ﬁrst selects a bijection fi : A → B.
Then the spoiler moves in exactly the same way as in the Ehrenfeucht-Fra¨ıss´e
game: that is, he plays either ai ∈ A or bi ∈ B. The duplicator responds by
either f(ai) or f−1
(bi). As in the Ehrenfeucht-Fra¨ıss´e game, the duplicator
wins if, after k rounds, the moves (a, b) form a winning position: that is, (a, cA
)
and (b, cB
) are a partial isomorphism between A and B.
If the duplicator has a winning strategy in the k-round bijective game on
A and B, we write A ≡bij
k B. Clearly, it is harder for the duplicator to win the
bijective game; that is, A ≡bij
k B implies A ≡k B. In the bijective game, the
duplicator does not simply come up with responses to all the possible moves
by the spoiler, but he has to establish a one-to-one correspondence between
the spoiler’s moves and his responses.
The following is immediate from the deﬁnitions.
Lemma 4.20. A ≃bij
k B iﬀ A ≡bij
k B.
By Corollary 4.10, (A, u)⇆3d+1(B, v) implies the existence of a bijection
f : A → B such that (A, uc) ⇆d (B, vf(c)) for all c ∈ A. Since the winning
condition in the bijective game is that NA
0 (a) ∼= NB
0 (b), where a and b are the
moves of the game on A and B, by induction on k we conclude:
Corollary 4.21. If (A, a) ⇆(3k−1)/2 (B, b), then (A, a) ≡bij
k (B, b).
Bijective games, as will be seen, characterize the expressive power of a
certain logic. Since the bijective game is harder to win for the duplicator than
the ordinary Ehrenfeucht-Fra¨ıss´e game, such a logic must be more expressive
than FO. Hence, the tool of Hanf-locality will be applicable to a certain extension
of FO. We shall see how it works when we discuss logics with counting
in Chap. 8.
Since the most general locality-based bounds apply to more restricted
games than the ordinary Ehrenfeucht-Fra¨ıss´e games, and hence to more expressive
logics, it is natural to ask whether more speciﬁc locality criteria can
be stated for FO. We now present two such criteria.
60 4 Locality and Winning Games
We start with Gaifman’s theorem. First, a few observations are needed. If σ
is a relational vocabulary, and m is the maximum arity of a relation symbol in
it, m ≥ 2, then the Gaifman graph G(A) is deﬁnable by a formula of quantiﬁer
rank m − 2. (Note that for the case of unary relations, the Gaifman graph is
simply {(a, a) | a ∈ A} and hence is deﬁnable by the formula x = y.)
We show this for the case of a single ternary relation R; a general proof
should be obvious. The Gaifman graph is then deﬁned by the formula
(x = y) ∨ ∃z
R(x, y, z) ∨ R(x, z, y) ∨ R(y, x, z)
∨ R(y, z, x) ∨ R(z, x, y) ∨ R(z, y, x)
.
Since the Gaifman graph is FO-deﬁnable, so is the r-ball of any tuple x.
That is, for any ﬁxed r, there is a formula d≤r
(y, x) such that A |= d≤r
(b, a)
iﬀ dA(b, a) ≤ r. Similarly, there are formulae d=r
and d>r
. We can next deﬁne
local quantiﬁcation
∃y ∈ Br(x) ϕ ∀y ∈ Br(x) ϕ
simply as abbreviations: ∃y ∈ Br(x) ϕ stands for ∃y d≤r
(y, x) ∧ ϕ , and
∀y ∈ Br(x) ϕ stands for ∀y d≤r
(y, x) → ϕ .
For a ﬁxed r, we say that a formula ψ(x) is r-local around x, and write this
as ψ(r)
(x), if all quantiﬁcation in ψ is of the form ∃y ∈ Br(x) or ∀y ∈ Br(x).
Theorem 4.22 (Gaifman). Let σ be relational. Then every FO formula
ϕ(x) over σ is equivalent to a Boolean combination of the following:
• local formulae ψ(r)
(x) around x;
• sentences of the form
∃x1, . . . , xs
s
i=1
α(r)
(xi) ∧
1≤i<j≤s
d>2r
(xi, xj) .
Furthermore,
• the transformation from ϕ to such a Boolean combination is eﬀective;
• if ϕ itself is a sentence, then only sentences of the above form appear in
the Boolean combination;
• if qr(ϕ) = k, and n is the length of x, then the bounds on r and s are
r ≤ 7k
, s ≤ k + n.
Notice that Gaifman-locality of FO is an immediate corollary of Gaifman’s
theorem (hence the name). However, the proof we presented earlier is much
simpler than the proof of Gaifman’s theorem (Exercise 4.9), and the bounds
obtained are better.
4.5 Locality of FO Revisited 61
Thus, Gaifman-locality can be strengthened for the case of FO formulae.
Then what about Hanf-locality? The answer, as it turns out, is positive, if
one’s attention is restricted to structures in which degrees are bounded. We
start with the following deﬁnition.
Deﬁnition 4.23 (Threshold equivalence). Given two structures A, B in
a relational vocabulary, we write A ⇆thr
d,m B if for every isomorphism type τ
of a d-neighborhood of a point either
• both A and B have the same number of points that d-realize τ, or
• both A and B have at least m points that d-realize τ.
Thus, if m were allowed to be inﬁnity, A ⇆thr
d,∞ B would be the usual
deﬁnition of A ⇆d B. In the new deﬁnition, however, we are only interested
in the number of elements that d-realize a type of neighborhood up to a
threshold: below the threshold, the numbers must be the same, but above it,
they do not have to be.
Theorem 4.24. For each k, l > 0, there exist d, m > 0 such that for A, B ∈
STRUCTl[σ],
A ⇆thr
d,m B implies A ≡k B.
Proof. The proof is very similar to the proof of Hanf-locality of FO. We deﬁne
inductively r0 = 0, ri+1 = 3ri+1, take d = rk−1, and prove that the duplicator
can play the Ehrenfeucht-Fra¨ıss´e game on A and B in such a way that after
i rounds (or: with k − i rounds remaining),
NA
rk−i
(ai) ∼= NB
rk−i
(bi), (4.4)
where ai, bi are points played in the ﬁrst i rounds of the game.
It only remains to specify m. Recall from the proof of Theorem 4.17 that
there is a function Gσ : N × N such that the maximum size of a radius d
neighborhood of a point in a structure in STRUCTl[σ] is Gσ(d, l). We take m
to be k · Gσ(rk, l).
The rest is by induction on i. For the ﬁrst move, suppose the spoiler plays
a ∈ A. By A ⇆thr
rk,m B, the duplicator can ﬁnd b ∈ B with NA
rk
(a) ∼= NB
rk
(b).
Now assume (4.4) holds after i rounds. That is, NA
3r+1(ai) ∼= NB
3r+1(bi),
where r = rk−(i+1). We have to show that (4.4) holds after i + 1 rounds (i.e.,
with k − (i + 1) rounds remaining). Suppose in round i + 1 the spoiler plays
a ∈ A (the case of a move in B is identical). If a ∈ BA
2r+1(ai), the response
is by the isomorphism between NA
3r+1(ai) and NB
3r+1(bi), which guarantees
(4.4). If a ∈ BA
2r+1(ai), let τ be the isomorphism type of the r-neighborhood
of a. To ensure (4.4), all we need is to ﬁnd b ∈ B such that b r-realizes τ in
B, and dB(b, bi) > 2r + 1 – then such an element b would be the response of
the duplicator.
62 4 Locality and Winning Games
Assume that there is no such element b. Since there is an element a ∈ A
that r-realizes τ in A, there must be an element b′
∈ B that r-realizes τ in B.
Then all such elements b′
must be in NB
2r+1(bi). Let there be s of them.
Notice that the cardinality of NB
2r+1(bi) does not exceed m = k · Gσ(rk, l).
This is because the length of bi is at most k, the size of each rk neighborhood
is at most Gσ(rk, l), and 2r + 1 ≤ rk.
Therefore, s ≤ m, and from A⇆thr
d,mB we see that there are exactly s elements
a′
∈ A that r-realize τ in A. But by the isomorphism between NA
3r+1(ai)
and NB
3r+1(bi) we know that NA
2r+1(ai) alone contains s such elements, and
hence there are at least s + 1 of them in A. This contradiction shows that we
can ﬁnd b that r-realizes τ in B outside of NB
2r+1(bi), which completes the
proof of (4.4) and the theorem.
The threshold equivalence is a useful tool when in the course of proving
inexpressibility of a certain property, one constructs pairs of structures Ak, Bk
whose universes have diﬀerent cardinalities: then Hanf-locality is inapplicable.
For example, consider the following query over graphs. Suppose the input
graph is a simple cycle with loops on some nodes (i.e., it has edges
(a1, a2), (a2, a3), . . . , (an−1, an), (an, a1), with all ais distinct, as well as some
edges of the form (ai, ai)). The question is whether the number of loops is
even. An attempt to prove that it is not FO-deﬁnable using Hanf-locality
does not succeed: for any d > 0, and any two structures A, B with A ⇆d B,
the numbers of nodes with loops in A and B are equal.
However, the threshold equivalence helps us. Assume that the above query
Q is expressible by a sentence of quantiﬁer rank k. Then apply Theorem 4.24 to
k and 2 (the maximum degree in graphs described above), and ﬁnd d, m > 0.
We now construct a graph Gd,n for any n > 0, as a cycle on which the distance
between any two consecutive nodes with loops is 2d+2, and the number of such
nodes with loops is n. One can then easily check that Gd,m+1 ⇆thr
d,m Gd,m+2
and hence the two must agree on Q. This is certainly impossible, showing that
Q is not FO-deﬁnable.
Note that in this example, Gd,m+1 ⇆r Gd,m+2 for any r > 0, since the
cardinalities of Gd,m+1 and Gd,m+2 are diﬀerent, and hence Hanf-locality is
not applicable.
4.6 Bibliographic Notes
The ﬁrst locality result for FO was Hanf’s theorem, formulated in 1965 by
Hanf [120] for inﬁnite models. The version for the ﬁnite case was presented
by Fagin, Stockmeyer, and Vardi in [76]. In fact, [76] proves what we call the
threshold equivalence for FO, and what we call Hanf-locality is stated as a
corollary.
4.7 Exercises 63
Gaifman’s theorem is from [88]; Gaifman-locality, inspired by it, was introduced
by Hella, Libkin, and Nurmonen [123], who also proved Theorem
4.11. The proof of Hanf-locality for FO follows Libkin [167].
The bounded number of degrees property (BNDP) is from Libkin and
Wong [169] (where it was called BDP, and proved only for FO-deﬁnable queries
over graphs). Dong, Libkin and Wong [57] showed that every Gaifman-local
query has the BNDP, and a simpler proof was given by Libkin [166].
Bijective games were introduced by Hella [121], and the connection between
them and Hanf-locality is due to Nurmonen [188]; the presentation
here follows [123].
Sources for exercises:
Exercise 4.9: Gaifman [88]
Exercises 4.10, 4.11, and 4.12: Libkin [166]
Exercise 4.13: Dong, Libkin, and Wong [57]
Exercise 4.14: Schwentick and Barthelmann [217]
Exercise 4.15: Schwentick [215]
4.7 Exercises
Exercise 4.1. Verify that all the results in Sects. 4.1–4.4 extend to vocabularies
with constant symbols.
Exercise 4.2. Prove Lemma 4.4.
Exercise 4.3. Prove Lemma 4.5.
Exercise 4.4. Prove Lemma 4.8.
Exercise 4.5. Prove Lemma 4.15.
Exercise 4.6. Use Hanf-locality to give a simple proof that graph acyclicity and
testing if a graph is a tree are not FO-deﬁnable.
Exercise 4.7. Consider colored graphs: that is, structures of vocabulary
{E, U1, . . . , Uk} where E is binary and U1, . . . , Uk are unary (i.e., Ui deﬁnes the
set of nodes of color i). Prove that neither connectivity nor transitive closure are
FO-deﬁnable over colored graphs.
Exercise 4.8. Provide a complete proof that testing if a binary tree is balanced is
not FO-deﬁnable.
Exercise 4.9. Prove Theorem 4.22.
Exercise 4.10. In all the proofs in this chapter we obtained bounds on locality
ranks of the order O(3k
), where k is the quantiﬁer rank. And yet the exponential
lower bound was O(2k
). The goal of this exercise is to reduce the upper bound from
O(3k
) to O(2k
), at the expense of a slightly more complicated proof.
64 4 Locality and Winning Games
Let x = (x1, . . . , xn), and let I = {I1, . . . , Im} be a partition of {1, . . . , n}. The
subtuple of x that consists of the components whose indices are in Ij is denoted by
xI
j .
Let r > 0. Given two structures, A and B, and a ∈ An
, b ∈ Bn
, we say that a
and b are (I, r)-similar if the following hold:
• NA
r (aI
j ) ∼= NB
r (bI
j ) for all j = 1, . . . , m;
• d(aI
j , aI
l ) > r for all l = j;
• d(bI
j , bI
l ) > r for all l = j.
We call a and b r-similar if there exists a partition I such that a and b are (I, r)similar.
A formula ϕ has the r-separation property if A |= ϕ(a) ↔ ϕ(b) whenever a
and b are r-similar.
Your ﬁrst task is to prove that a formula has the separation property iﬀ it is
Gaifman-local.
Next, prove the following. If r > 0, A⇆rB, and a, b are 2r-similar, then there
exists a bijection f : A → B such that, for every c ∈ A, the tuples ax and bf(c) are
r-similar.
Use this result to show that lr(ϕ) ≤ 2k
for every FO formula ϕ of quantiﬁer rank
k.
Exercise 4.11. Deﬁne functions Hanf rankFO, Gaifman rankFO : N → N as follows:
Hanf rankFO(n) = max{hlr(ϕ) | ϕ ∈ FO, qr(ϕ) = n},
Gaifman rankFO(n) = max{lr(ϕ) | ϕ ∈ FO, qr(ϕ) = n}.
Assume that the vocabulary is purely relational. Prove that for every n > 1,
Hanf rankFO(n) = 2n−1
− 1 and Gaifman rankFO(n) = 2n
− 1.
Exercise 4.12. Exponential lower bounds for locality rank were achieved on formulae
of quantiﬁer rank n with the total number of quantiﬁers exponential in n.
Could it be that locality rank is polynomial in the number of quantiﬁers?
Your goal is to show that the answer is negative. More precisely, show that there
exist FO formulae with n quantiﬁers and locality rank O(
√
2
n
).
Exercise 4.13. The BNDP was formulated in a rather asymmetric way: the assumption
was that ∀i ∈ deg set(A) (i ≤ l), and the conclusion that |deg set(Q(A))|≤
fQ(l). A natural way to make it more symmetric is to introduce the following property
of a query Q: there exists a function f′
Q : N → N such that
|deg set(Q(A))| ≤ f′
Q(|deg set(A)|)
for ever structure A.
Prove that there are FO-deﬁnable queries on ﬁnite graphs that violate the above
property.
Exercise 4.14. Recall that a formula ϕ(x) is r-local around x if all the quantiﬁcation
is of the form ∃y ∈ Br(x) and ∀y ∈ Br(x). We now say that ϕ(x) is basic r-local
around x if it is a Boolean combination of formulae of the form α(xi), where xi is a
component of x, and α(xi) is r-local around xi. A formula is local (or basic local)
around x if it is r-local (or basic r-local) around x for some r.
4.7 Exercises 65
Prove that every FO formula ϕ(x) that is local around x is logically equivalent
to a formula that is basic local around x.
Use this result to prove that any FO sentence is logically equivalent to a sentence
of the form
∃x1 . . . ∃xn∀y ϕ(x1, . . . , xn, y),
where ϕ(x1, . . . , xn, y) is local around (x1, . . . , xn, y).
Exercise 4.15. This exercise presents a suﬃcient condition that guarantees a winning
strategy by the duplicator. It shows that if two structures look similar (meaning
that the duplicator has a winning strategy), and are extended to bigger structures in
a “similar way”, then the duplicator has a winning strategy on the bigger structures
as well.
Let A, B be two structures of the same vocabulary that contains only relation
symbols. Let A0, B0 be their substructures, with universes A0 and B0, respectively,
and let A1 and B1 be substructures of A and B whose universes are A − A0 and
B − B0.
For every a ∈ A, dA(a, A0) is, as usual, min{dA (a, a0) | a0 ∈ A0}, and dB (b, B0)
is deﬁned similarly. Let A(r) (B(r)) be the substructure of A (respectively, B) whose
universe is {a | dA(a, A0) ≤ r} (respectively, {b | dB (b, B0) ≤ r}). We write
A(r) ≡dist
k B(r)
if A(r) ≡k B(r) and, whenever ai, bi are moves in the ith round, dA(ai, A0) =
dB (bi, B0). We also write
A1
∼=
dist
B1
if there is an isomorphism h : A1 → B1 such that dA(a, A0) = dB (h(a), B0) for
every a ∈ A − A0.
Now assume that the following two conditions hold:
1. A(2k) ≡dist
k B(2k), and
2. A1
∼=dist
B1.
Prove that A ≡k B.
Exercise 4.16. Let σ consist of one binary relation E, and let Φ be a σ-sentence.
Prove that it is decidable whether Φ has a model in STRUCT1[σ]; that is, one can
decide if there is a ﬁnite graph G in which all in- and out-degrees are 0 and 1 such
that G |= Φ.
5
Ordered Structures
We know how to prove basic results about FO; so now we start adding things
to FO. One way to make FO more expressive is to include additional operations
on the universe. For example, in database applications, data items stored
in a database are numbers, strings, etc. Both numbers and strings could be
ordered; on numbers we have arithmetic operations, on strings we have concatenation,
substring tests, and so on. As query languages routinely use those
operations, one may want to study them in the context of FO.
In this chapter, we describe a general framework of adding new operations
on the domain of a ﬁnite model. The main concept is that of invariant queries,
which do not depend on a particular interpretation of the new operations. We
show that such an addition could increase the expressiveness of a logic, even
for properties that do not mention those new operations. We then concentrate
on one operation of special importance: a linear order on the ﬁnite universe.
We study FO(<) – that is, FO with an additional linear order < on the
universe, and study its expressive power.
Adding ordering will be of importance for almost all logics that we study
(the only exception is fragments of second-order logic, where linear orderings
are deﬁnable). We shall observe the following general phenomenon: for any
logic that cannot deﬁne a linear ordering, adding one increases the expressive
power, even for invariant queries.
5.1 Invariant Queries
We start with an example. Suppose we have a vocabulary σ, and an additional
vocabulary σ<,+ = {<, +}, where < is a binary relation symbol, and + is a
ternary relation symbol. The intended interpretation is as follows. Given a set
A, the relation < is interpreted as a linear ordering on it, say a1 < . . . < an,
if A = {a1, . . . , an}. Then + is interpreted as
{(ai, aj, ak) | ai, aj, ak ∈ A and i + j = k}.
68 5 Ordered Structures
Recall that the query even(A) testing if | A |= 0 (mod 2) is not expressible
over σ-structures: we proved this by using Ehrenfeucht-Fra¨ıss´e games. Now
assume that we are allowed to use σ<,+ symbols in the query. Then we can
write:
Φ = ¬∃x (x = x) ∨ ∃x∃y (x + x = y) ∧ ¬∃z (y < z) .
That is, either the universe is empty, or y is the largest element of the universe
and y = x + x for some x. Then Φ tests if |A|= 0 (mod 2).
However, one has to be careful with this statement. We cannot write A |=
Φ iﬀ even(A) for a σ-structure A, simply because Φ is not a sentence of
vocabulary σ. The structure in which Φ is checked is an expansion of A with
an interpretation of predicate symbols in σ<,+. That is, if A<,+ is a structure
with universe A in which <, + are interpreted as was shown above, then
(A, A<,+) |= Φ iﬀ even(A).
Here by (A, A<,+) we mean the structure whose universe is A, the symbols
from σ are interpreted as in A, and <, + are interpreted as in A<,+.
Before giving a general deﬁnition, we make another important observation.
If we ﬁnd any other interpretation for symbols < and +, as long as < is a
linear ordering on A and + is the addition corresponding to <, the result of
the query deﬁned by Φ will be the same. This is the idea of invariance: no
matter how the extra relations are interpreted, the result of the query is the
same.
We now formalize this concept. Recall that if σ and σ′
are two disjoint
vocabularies, A ∈ STRUCT[σ], A′
∈ STRUCT[σ′
], and A, A′
have the same
universe A, then (A, A′
) stands for a structure of vocabulary σ ∪ σ′
, in which
the universe is A, and the interpretation of σ (respectively, σ′
) is inherited
from A (A′
).
Deﬁnition 5.1. Let σ and σ′
be two disjoint vocabularies, and let C be a class
of σ′
-structures. Let A ∈ STRUCT[σ]. A formula ϕ(x) in the language of σ∪σ′
is called C-invariant on A if for any two C structures A′
and A′′
on A we have
ϕ[(A, A′
)] = ϕ[(A, A′′
)].
A formula ϕ is C-invariant if it is C-invariant on every σ-structure.
If ϕ(x) is C-invariant, we associate with it an m-ary query Qϕ, where
m =|x|. It is given by
a ∈ Qϕ(A) iﬀ (A, A′
) |= ϕ(a),
where A′
is some σ′
-structure in C whose universe is A. By invariance, it does
not matter which C-structure A′
is used.
We shall write FO + C for a class of all queries on σ ∪ σ′
-structures, and
5.2 The Power of Order-invariant FO 69
(FO + C)inv
for the class of queries Qϕ, where ϕ is a C-invariant formula over σ ∪ σ′
.
The most important case for us is when C is the class of ﬁnite linear
orderings. In that case, we write < instead of C and use the notation
(FO+<)inv.
We refer to queries in this class as order-invariant queries.
Notice that (FO+<)inv refers to a class of queries, rather than a logic. In
fact, we shall see in Chap. 9 (Exercise 9.3) that it is undecidable whether an
FO sentence is <-invariant.
Coming back to our example of expressing even with < and +, the sentence
Φ is a C<,+-invariant sentence, where C<,+ is the class of ﬁnite structures
A, <, + , with a1 < . . . < an being a linear order on A, and + deﬁned as
{(ai, aj, ak) | i + j = k}. The Boolean query QΦ deﬁned by this invariant
sentence is precisely even.
In some cases, establishing bounds on FO + C and (FO + C)inv is easy. For
example, the proof that the bounded number of degrees property (BNDP)
holds for FO shows that adding any structure of bounded degree would not
violate the BNDP. Thus, we have the following result.
Proposition 5.2. Let C ⊆ STRUCTl[σ′
] for a ﬁxed l ≥ 0. Then
(FO + C) queries have the BNDP. In particular, (FO + C) cannot express the
transitive closure query.
The situation becomes much more interesting when degrees are not
bounded; for example, when C is the class of linear orderings. We study it
in the next section.
5.2 The Power of Order-invariant FO
While queries in (FO+C)inv are independent of any particular structure from
C, the mere presence of such a structure can have an impact on the expressive
power.
In fact, this can be demonstrated for the class of (FO+<)inv queries. The
main result we prove here is the following.
Theorem 5.3 (Gurevich). There are (FO+<)inv queries that are not FOdeﬁnable.
That is,
FO (FO+<)inv.
In the rest of the section we present the proof of this theorem. The proof is
constructive: we explicitly generate the separating query, show that it belongs
to (FO+<)inv, and then prove that it is not FO-deﬁnable.
70 5 Ordered Structures
We consider structures in the vocabulary σ = {⊆} where ⊆ is a binary
relation symbol. The intended interpretation of σ-structures of interest to us
is ﬁnite Boolean algebras: that is, 2X
, ⊆ , where X is a ﬁnite set.
We ﬁrst show that there is a sentence ΦBA such that A |= ΦBA iﬀ A
is of the form 2X
, ⊆ for a ﬁnite X. For that, we shall need the following
abbreviations:
• ⊥(x) ≡ ∀z (x ⊆ z) (intended interpretation of x then is the empty set);
• ⊤(x) ≡ ∀z (z ⊆ x) (x is the maximal element with respect to ⊆);
• x ∪ y = z ≡ (x ⊆ z) ∧ (y ⊆ z) ∧ ∀u (x ⊆ u) ∧ (y ⊆ u) → (z ⊆ u) ;
• x ∩ y = z ≡ (z ⊆ x) ∧ (z ⊆ y) ∧ ∀u (u ⊆ x) ∧ (u ⊆ y) → (u ⊆ z) ;
• atom(x) ≡ ¬⊥(x) ∧ ∀z z ⊆ x → (z = x ∨ ⊥(z)) (i.e., x is an atom,
or a singleton set);
• x = ¯y ≡ ∀z (x ∪ y = z → ⊤(z)) ∧ ∀z (x ∩ y = z → ⊥(z)) (x is the
complement of y).
The sentence ΦBA is now the usual axiomatization for atomic Boolean
algebras; that is, it is a conjunction of sentences that assert that ⊆ is a partial
ordering, ∪ and ∩ exist, are unique, and satisfy the distributivity law and the
absorption law (x ∩ (x ∪ y) = x); that the least and the greatest elements ⊥
and ⊤ are unique; and that complements are unique and satisfy De Morgan’s
laws. Clearly, this can be stated as an FO sentence.
We now formulate the separating query Qeven
atom:
Qeven
atom(A) = true ⇔ A |= ΦBA and |{a | A |= atom(a)}|= 0 (mod 2).
That is, it checks if the number of atoms in the ﬁnite Boolean algebra A
is even.
Lemma 5.4. Qeven
atom ∈ (FO+<)inv.
Proof. Let < be an ordering on the universe of A. It orders the atoms of the
Boolean algebra: a0 < . . . < an−1. To check if the number of atoms is even,
we check if there is a set that contains all the atoms in even positions (i.e.,
a0, a2, a4, . . .) and does not contain an−1. For that, we deﬁne the following
formulae:
• ﬁrstatom(x) ≡ atom(x) ∧ ∀y (atom(y) → x ≤ y).
• lastatom(x) ≡ atom(x) ∧ ∀y (atom(y) → y ≤ x).
• nextatom(x, y) ≡
atom(x) ∧ atom(y) ∧ (x < y)
∧ ¬∃z atom(z) ∧ (x < z) ∧ (z < y)
.
5.2 The Power of Order-invariant FO 71
That is, ﬁrstatom(x) is true of a0, lastatom(x) is true of an−1, and
nextatom(x, y) is true of any pair (ai−1, ai), 0 < i ≤ n − 1.
Based on these, we express Qeven
atom by the sentence below:
∃z


∀x ﬁrstatom(x) → x ⊆ z
∧ ∀x lastatom(x) → ¬(x ⊆ z)
∧ ∀x, y nextatom(x, y) → ((x ⊆ z) ↔ ¬(y ⊆ z))

 .
That is, the above sentence is true iﬀ the set containing the even atoms
a0, a2, . . . does not contain an−1. Note that the set z may be diﬀerent for
each diﬀerent interpretation of the linear ordering <, but the sentence still
tests if the number of atoms is even, which is a property independent of a
particular ordering.
Lemma 5.5. Qeven
atom is not FO-deﬁnable (in the vocabulary {⊆}).
Proof. We shall use a game argument. Notice that locality does not help us
here: in 2X
, ⊆ , for any two sets C, D ⊆ X, the distance between them is at
most 2, since ∅ ⊆ C, D.
The proof illustrates the idea of composing a larger Ehrenfeucht-Fra¨ıss´e
game from smaller and simpler games, already seen in Chap. 3.
In the proof, we shall be using games on Boolean algebras. We ﬁrst observe
that if 2X
, ⊆ ≡k 2Y
, ⊆ , then we can assume, without any loss of generality,
that the duplicator has a winning strategy in which he responds to the empty
set by the empty set, to X by Y , and to Y by X. Indeed, suppose the spoiler
plays ∅ in 2X
, and the duplicator responds with Y ′
= ∅ in 2Y
. If there is one
more round left in the game, the spoiler would play the empty set in 2Y
, and
the duplicator has no response in 2X
, contradicting the assumption that he
has a winning strategy. Thus, in every round but the last, the duplicator must
respond to ∅ by ∅. If the spoiler plays ∅ in 2X
in the last round, it is contained
in all the other moves played in 2X
, and the duplicator can respond by ∅ in
2Y
to maintain partial isomorphism. The proof for the other cases is similar.
Next, we need the following composition result.
Claim 5.6. Let 2X1
, ⊆ ≡k 2Y1
, ⊆ and 2X2
, ⊆ ≡k 2Y2
, ⊆ . Assume that
X1 ∩ X2 = Y1 ∩ Y2 = ∅. Then
2X1∪X2
, ⊆, X1, X2 ≡k 2Y1∪Y2
, ⊆, Y1, Y2 . (5.1)
Proof of Claim 5.6. Let Ai, Bi, i ≤ k, be the moves by the spoiler and the
duplicator in the game (5.1). Let A1
i = Ai ∩ X1, A2
i = Ai ∩ X2, and likewise
B1
i = Bi ∩ Y1, B2
i = Bi ∩ Y2. The winning strategy for the duplicator is as
follows. Suppose i−1 rounds have been played, and in the ith round the spoiler
plays Ai ⊆ X1 ∪ X2 (the case of the spoiler playing in Y1 ∪ Y2 is symmetric).
The duplicator considers the position
((A1
1, . . . , A1
i−1), (B1
1 , . . . , B1
i−1))
72 5 Ordered Structures
in the game on 2X1
, ⊆ and 2Y1
, ⊆ , and ﬁnds his response B1
i ⊆ Y1 to
A1
i . Similarly, he ﬁnds B2
i ⊆ Y2 as the response to A2
i in the position
((A2
1, . . . , A2
i−1), (B2
1, . . . , B2
i−1)) in the game on 2X2
, ⊆ and 2Y2
, ⊆ . His
response to Ai is then Bi = B1
i ∪ B2
i . Clearly, playing in such a way, the duplicator
preserves the ⊆ relation. Furthermore, it follows from the observation
made before the claim that this strategy also preserves the constants: that is,
if the spoiler plays X1, then the duplicator responds by Y1, etc. Hence, the
duplicator has a winning strategy for (5.1). This proves the claim.
The lemma now follows from the claim below.
Claim 5.7. Let |X |, |Y |≥ 2k
. Then
2X
, ⊆ ≡k 2Y
, ⊆ .
Indeed, assume Qeven
atom is deﬁnable by an FO-sentence of quantiﬁer rank k.
Take any X of odd cardinality and any Y of even cardinality, greater than
2k
. By Claim 5.7, 2X
, ⊆ ≡k 2Y
, ⊆ , and hence they must agree on Qeven
atom
which is clearly false.
Proof of Claim 5.7. It will be proved by induction on k. The cases of k = 0, 1
are obvious. Going from k to k+1, suppose we have X, Y with |X |, |Y |≥ 2k+1
.
Assume, without loss of generality, that the spoiler plays A ⊆ X in 2X
, ⊆ .
There are three possibilities.
1. | A |< 2k
. Pick an arbitrary B ⊆ Y with | B |=| A |. Then both | X − A |
and |Y −B | exceed 2k
. Thus, by the induction hypothesis, 2X−A
, ⊆ ≡k
2Y −B
, ⊆ . Furthermore, 2A
, ⊆ ∼= 2B
, ⊆ , which implies a weaker fact
that 2A
, ⊆ ≡k 2B
, ⊆ . By Claim 5.6,
2X
, ⊆, A ≡k 2Y
, ⊆, B ,
meaning that after the duplicator responds to A with B, he can continue
playing for k more rounds. This ensures a winning position, for the duplicator,
after k + 1 rounds.
2. |X − A|< 2k
. Pick an arbitrary B ⊆ Y with |Y − B |=|X − A|. Then the
proof follows case 1.
3. | A |≥ 2k
and | X − A |≥ 2k
. Since | Y |≥ 2k+1
, we can ﬁnd B ⊆ Y with
|B |≥ 2k
and |Y − B |≥ 2k
. Then, by the induction hypothesis,
2A
, ⊆ ≡k 2B
, ⊆ ,
2X−A
, ⊆ ≡k 2Y −B
, ⊆ ,
and we again conclude 2X
, ⊆, A ≡k 2Y
, ⊆, B , thus proving the winning
strategy for the duplicator in k + 1 moves.
5.3 Locality of Order-invariant FO 73
This completes the proof of the claim, and of Theorem 5.3.
Gurevich’s theorem is one of many instances of the proper containment
L (L+<)inv, which holds for many logics of interest in ﬁnite model theory.
We shall see similar results for logics with counting, ﬁxed point logics, several
inﬁnitary logics, and some restrictions of second-order logic.
5.3 Locality of Order-invariant FO
We know how to establish some expressivity bounds on invariant queries:
for example, if extra relations are of bounded degree, then invariant queries
have the BNDP. There are important classes of auxiliary relations that are
of bounded degree. For example, the class Succ of successor relations: that
is, graphs of the form {(a0, a1), (a1, a2), . . . , (an−1, an)} where all ai’s are distinct.
Then the BNDP applies to FO + Succ, because for any A ∈ Succ,
deg set(A) = {0, 1}.
Adding order instead of successor destroys the BNDP, because for an ordering
L on n elements, deg set(L) = {0, . . ., n − 1}. Moreover, while FO+<
is local, locality does not tell us anything interesting. With a linear ordering,
the distance between any two distinct elements is 1. Therefore, if a structure
A is ordered by <, then N
(A,<)
1 (a) = (A, <, a). Hence, every query is trivially
Gaifman-local with locality rank 1.
Gaifman-locality is a useful concept when applied to “sparse” structures,
and structures with a linear order are not such. However, invariant queries
do not talk about the order: they simply use it, but they are deﬁned on σstructures
for σ that does not need to include an ordering. Hence, if we could
establish locality of order-invariant FO-deﬁnable queries, it would give us very
useful bounds on the expressive power of (FO+<)inv. All the locality proofs
we presented earlier would not work in this case, since FO formulae deﬁning
invariant queries do use the ordering. Nevertheless, the following is true.
Theorem 5.8 (Grohe-Schwentick). Every m-ary query in (FO+ <)inv,
m ≥ 1, is Gaifman-local.
This theorem gives us easy bounds for FO+<. For example, to show that
the transitive closure query is not deﬁnable in FO+ <, one notices that it
is an invariant query. Hence, if it were expressible in FO+ <, it would have
been an (FO+<)inv query, and thus Gaifman-local. We know, however, that
transitive closure is not Gaifman-local.
The proof of the theorem is quite involved, and we shall prove a slightly
easier result (that is still suﬃcient for most inexpressibility proofs). We say
that an m-ary query Q, m > 0, is weakly local if there exists a number d ≥ 0
such that for any structure A and any a1, a2 ∈ Am
with
a1 ≈A
d a2 and BA
d (a1) ∩ BA
d (a2) = ∅
74 5 Ordered Structures
it is the case that
a1 ∈ Q(A) iﬀ a2 ∈ Q(A).
That is, the only diﬀerence between weak locality and the usual Gaifmanlocality
is that for the former, the neighborhoods are required to be disjoint.
The result that we prove is the following.
Proposition 5.9. Every unary query in (FO+<)inv is weakly local.
The proof will demonstrate all the main ideas required to prove Theorem
5.8; completing the proof of the theorem is the subject of Exercises 5.8 and
5.9.
The statement of Proposition 5.9 is also very powerful, and suﬃces for
many bounds on the expressive power of FO+<. Suppose, for example, that
we want to show that the same-generation query over colored trees is not in
FO+<. Since same generation is order-invariant, it suﬃces to show that it is
not weakly local, and thus not in (FO+<)inv.
We consider colored trees as structures of the vocabulary (E, C), where E
is binary and C is unary, and assume, towards a contradiction, that a binary
query Qsg (same generation) is deﬁnable in FO+< by a formula ϕ(x, y). Let
ψ(x) ≡ ∃y C(y) ∧ ϕ(x, y) .
Then ψ deﬁnes a unary order-invariant query, testing if there is a node y in
the set C such that (x, y) is in the output of Qsg. To show that it is not
weakly local, assume to the contrary that it is, and construct a tree T as
follows. Let d witness the weak locality of the query deﬁned by ψ. Then T has
three branches coming from the root, two of length d + 1 and one of length
d + 2. Let the leaves be a, b, c, with c being the leaf of the branch of length
d + 2. The set C is then {a}. Note that b ≈T
d c and their balls of radius d are
disjoint, and yet (T, <) |= ψ(b) ∧ ¬ψ(c) for any ordering <. Hence, ψ is not
weakly local, and thus Qsg is not deﬁnable in FO+<.
We now move to the proof of Proposition 5.9. First, we present the main
idea of the proof. For that, we deﬁne the radius r sphere, r > 0, of a tuple a
in a structure A as
SA
r (a) = BA
r (a) − BA
r−1(a).
That is, SA
r (a) is the set of elements at distance exactly r from a. As usual, the
superscript A
will be omitted when irrelevant or understood. We ﬁx, for the
proof, the vocabulary of the structure to be that of graphs; that is, σ = (E),
where E is binary. This will simplify notation without any loss of generality.
Given a structure A and a ∈ A, its d-ball can be thought of as a sequence
of r-spheres, r ≤ d, where E-edges could go between Si(a) and Si+1(a), or
between two elements of the same sphere.
Let Q be a unary (FO+<)inv query on STRUCT[σ], deﬁned by a formula
ϕ(x) of quantiﬁer rank k. Fix a suﬃciently large d (exact bounds will be clear
5.3 Locality of Order-invariant FO 75
from the proof), and consider a ≈A
d b, with Bd(a) and Bd(b) disjoint. Let h
be an isomorphism h : Nd(a) → Nd(b).
We now ﬁx a linear ordering ≺a on Bd(a) such that dA(a, x) < dA(a, y)
implies x ≺a y. In particular, a is the smallest element with respect to ≺a.
We let ≺b be the image of ≺a under h. Let ≺0 be a ﬁxed linear ordering on
A − Bd(a, b). We now deﬁne a preorder ≺ as follows:
x ≺ y iﬀ x ≺a y, x, y ∈ Bd(a)
or x ≺b y, x, y ∈ Bd(b)
or h(x) ≺b y, x ∈ Bd(a), y ∈ Bd(b)
or x ≺b h(y), x ∈ Bd(b), y ∈ Bd(a)
or x ≺0 y, x, y ∈ Bd(a, b)
or x ∈ Bd(a, b), y ∈ Bd(a, b).
In other words, ≺ is a preorder that does not distinguish elements x and
h(x), but it makes both x and h(x) less than y and h(y) whenever x ≺a y
holds. Furthermore, each element of Bd(a, b) is less than each element of the
complement, A − Bd(a, b), which in turn is ordered by ≺0.
Our goal is to ﬁnd two linear orderings, ≤a and ≤b on A, such that
(A, a, ≤a) ≡k (A, b, ≤b). (5.2)
This would imply
a ∈ Q(A) iﬀ (A, ≤a) |= ϕ(a) iﬀ (A, ≤b) |= ϕ(b) iﬀ b ∈ Q(A). (5.3)
These orderings will be reﬁnements of ≺, and will be deﬁned sphere-by-sphere.
For the ≤a ordering, a is the smallest element, and for the ≤b ordering, b is the
smallest. On Sd(a)∪Sd(b), the orderings ≤a and ≤b must coincide (otherwise
the spoiler will win easily).
Note that ≺ is a preorder: the only pairs it does not order are pairs of
the form (x, h(x)). To deﬁne ordering on them, we select two “sparse” sets of
integers J = {j1, . . . , jm} and L = {l1, . . . , lm+1} with 0 < j1 < . . . < jm < d
and 0 < l1 < l2 < . . . < lm+1 < d. “Sparse” here means that the diﬀerence
between two consecutive integers is at least 2k
+ 1 (other conditions will be
explained in the detailed proof). Assume that x ∈ Sr(a), y ∈ Sr(b), and
y = h(x), for r ≤ d. Then
x ≤a y ⇔ |{j ∈ J | j < r}| is even,
y ≤a x ⇔ |{j ∈ J | j < r}| is odd,
(5.4)
and
x ≤b y ⇔ |{l ∈ L | l < r}| is odd,
y ≤b x ⇔ |{l ∈ L | l < r}| is even.
(5.5)
Thus, the parity of the number of ji’s or li’s below r tells us whether the order
on pairs (x, h(x)) prefers the element from Bd(a) or Bd(b). Note that a is the
76 5 Ordered Structures
least element with respect to ≤a (in particular, a ≤a b), and b is the least
element for ≤b, but since the number of switches of preferences diﬀers by one
for ≤a and ≤b, on Sd(a, b) both orderings are the same.
Of course a switch can be detected by a ﬁrst-order formula, but we have
many of them, and they happen at spheres that are well separated. The key
idea of the proof is to use the sparseness of J and L to show that the diﬀerence
between them cannot be detected by the spoiler in k moves. This will ensure
(A, a, ≤a) ≡k (A, b, ≤b).
We now present the complete proof; that is, we show how to construct two
orderings, ≤a and ≤b, such that (5.2) holds. First, we may assume, without
loss of generality, that no sphere Sr(a, b), r ≤ d, is empty. If any Sr(a, b) were
empty, A would have been a disjoint union of Bd(a), Bd(b), and A − Bd(a, b),
with no E-edges between these sets. Then, using NA
d (a) ∼= NA
d (b), it is easy to
ﬁnd orderings ≤a and ≤b such that (A, a, ≤a) and (A, b, ≤b) are isomorphic,
and hence (A, a, ≤a) ≡k (A, b, ≤b) holds.
To deﬁne the radius d for a given k (the quantiﬁer rank of a formula
deﬁning Q), we need some additional notation. Let σ(r) be the vocabulary
(E, <, U−r, U−r+1, . . . , U−1, U0, U1, . . . , Ur−1, Ur), where all the Ui’s are
unary. Let t be the number of rank-(k + 1) types of σ(r) structures, where
r = 2k
(this number, as we know, depends only on k).
Let Σ be a ﬁnite alphabet of cardinality t. Recall that a string s of length
n over Σ is represented as a structure Ms of the vocabulary (<, A1, . . . , At)
with the universe {1, . . ., n} ordered by <, and each unary Ai interpreted as
the set of positions between 1 and n where the symbol is the ith symbol of Σ.
We call a subset X = {x1, . . . , xp} of {1, . . ., n} r-sparse if
min
i=j
|xi − xj |> r, xi > r, n − xi > r, for all i ≤ p.
Next, we need the following lemma.
Lemma 5.10. For every t, k ≥ 0, there exists a number d > 0 such that,
given any string s ∈ Σ∗
of length n ≥ d, where |Σ |= t, there exist two subsets
J, L ⊆ {1, . . . , n} such that
• |L|=|J | +1 > 2k
;
• J and L are 2k
+ 1-sparse; and
• (Ms, J) ≡k+2 (Ms, L).
The proof is a standard Ehrenfeucht-Fra¨ıss´e game argument, and is left to
the reader as an exercise (Exercise 5.6).
We now let d be given by Lemma 5.10, for k the quantiﬁer rank of a formula
deﬁning Q, and t the number of rank-(k + 1) types of σ(2k)-structures.
Fix a ≈A
d b, with Bd(a) and Bd(b) disjoint, and let h be an isomorphism
Nd(a) → Nd(b). For i, r ≤ d, let Ri
r(a) be a σ(r)-structure whose universe is
the union
5.3 Locality of Order-invariant FO 77
i+r
j=i−r
Sj(a)
(if j < 0 or j > d, we take the corresponding sphere to be empty), and
each Up is interpreted as Si+p(a), and the ordering is ≺a, the ﬁxed linear
ordering on Bd(a) such that dA(x, a) < dA(y, a) implies x ≺a y (restricted
to the universe of the structure). Structures Ri
r(b) are deﬁned similarly, with
the ordering being ≺b, the image of ≺a under the isomorphism h. Note that
Ri
r(b) ∼= Ri
r(a).
Let Σ be the set of rank-(k +1) types of σ(2k)-structures. Deﬁne a string s
of length d + 1 which, in position i = 1, . . . , d + 1, has the rank-(k + 1) type of
Ri−1
2k (a). Applying Lemma 5.10, we get two 2k
+ 1-sparse sets J, L such that
(Ms, J) ≡k (Ms, L). Let J = {j1, . . . , jm} with j0 = 0 < j1 < . . . < jm < d
and L = {l1, . . . , lm+1} with l0 = 0 < l1 < l2 < . . . < lm+1 < d. Using these J
and L, deﬁne ≤a and ≤b as in (5.4) and (5.5).
Let Nd,J(a) and Nd,L(a) be two structures in the vocabulary (E, <, U, c)
with the universe Bd(a). In both, the binary predicate E is inherited from A,
the ordering < is ≺a, and the constant c is a. The only diﬀerence is the unary
predicate U: it is interpreted as j∈J Sj(a) in Nd,J (a), and as l∈L Sl(a) in
Nd,L(a).
Let Aa stand for (A, ≤a, a) and Ab for (A, ≤b, b). The winning strategy for
the duplicator on Aa and Ab is based on the following lemma.
Lemma 5.11. The duplicator has a winning strategy in the k-round game
on Nd,J(a) and Nd,L(a). Moreover, if p1, . . . , pk are the moves on Nd,J (a),
and q1, . . . , qk are the moves on Nd,L(a), then the following conditions can be
guaranteed by the winning strategy:
1. If pi ∈ Sr(a) and d − r ≤ 2k−i
, then qi = pi.
2. If (r1, . . . , rk) and (r′
1, . . . , r′
k) are such that each pi is in the sphere Sri (a)
and qi is in the sphere Sr′
i
(a), then ((r1, . . . , rk), (r′
1, . . . , r′
k)) deﬁne a partial
isomorphism between (Ms, J) and (Ms, L).
The idea of Lemma 5.11 is illustrated in Fig. 5.1. We have two structures,
(Ms, J) and (Ms, L), which are linear orders with extra unary predicates,
and two additional unary predicates, J and L of diﬀerent parity, which are
shown as short horizontal segments. Using the fact that (Ms, J) ≡k+2 (Ms, L),
we prove that Nd,J(a) ≡k Nd,L(a). These are shown in Fig. 5.1 as two big
circles, with concentric circles inside representing spheres Sr with r being in
J or L, respectively. These spheres form the interpretation for an extra unary
predicate in the vocabulary of structures Nd,J(a) and Nd,L(a).
Next, we show that Proposition 5.9 follows from Lemma 5.11; after that,
we prove Lemma 5.11.
78 5 Ordered Structures
a•
(Ms, J) (Ms, L)
≡k+2 ≡k
Nd,J (a) Nd,L(a)
•a
Fig. 5.1. Games involved in the proof of Proposition 5.9
From Nd,J(a) ≡k Nd,L(a) to Aa ≡k Ab. We now show how Lemma 5.11
implies Proposition 5.9; that is, Aa ≡k Ab. The idea for the winning strategy
on Aa and Ab is that it almost mimics the one in Nd,J(a) ≡k Nd,L(a).
We shall denote moves in Aa by a1, . . ., and moves in Ab by b1, . . .. Suppose
the spoiler plays ai ∈ Aa (the case of a move in Ab is symmetric). If ai ∈
Bd(a, b), then bi = ai, and we also set pi = qi = a.
If ai ∈ Bd(a, b), we deﬁne pi ∈ Bd(a) to be ai if ai ∈ Bd(a), and h−1
(ai)
if ai ∈ Bd(b). The duplicator then determines the response qi to pi, according
to the Nd,J (a) ≡k Nd,L(a) winning strategy. The response bi is going to be
either qi itself, or h(qi), and we use sets J and L to determine if bi lives in
Bd(a) or Bd(b).
We deﬁne two mappings
vJ : Bd(a, b) → {0, 1} and vL : Bd(a, b) → {0, 1}
such that for every x ∈ Bd(a),
vJ (x) + vJ (h(x)) = vL(x) + vL(h(x)) = 1.
For x ∈ Bd(a), ﬁnd r ≤ d such that x ∈ Sr(a). Then
vJ (x) =
0 if |{j ∈ J | j < r}| is even,
1 otherwise,
and vJ (h(x)) = 1 − vJ (x). Similarly, for x ∈ Bd(b), we ﬁnd r such that
x ∈ Sr(b) and set
vL(x) =
0 if |{l ∈ L | l < r}| is even,
1 otherwise,
and deﬁne vL(x) = 1 − vL(h(x)) for x ∈ Bd(a).
We now look at qi and h(qi); we know that vL(qi) + vL(h(qi)) = 1. We
choose bi to be one of qi or h(qi) such that vL(bi) = vJ (ai).
5.3 Locality of Order-invariant FO 79
This describes the strategy; now we prove that it works. Dealing with the
constant is easy: if the spoiler plays a in Aa, then the duplicator has to respond
with b in Ab and vice versa.
We now move to the E-relation. Since the parity of |J | and |L| is diﬀerent,
condition 1 of Lemma 5.11 implies that for any move in Bd(a, b)−Bd−2m (a, b)
with m moves to go, the response is the identity. Hence, if E(ai, aj) holds,
and one or both of ai, aj are outside of Bd(a, b), then E(bi, bj) holds (and vice
versa). Therefore, it suﬃces to consider the case when E(ai, aj) holds, and
ai, aj ∈ Bd(a, b).
Assume, without loss of generality, that ai, aj ∈ Bd(a). Then E(pi, pj)
holds, and hence E(qi, qj) holds. Given the duplicator’s strategy, to conclude
that E(bi, bj) holds, we must show that both bi and bj belong to the same
ball – Bd(a) or Bd(b).
The elements ai and aj could come either from the same sphere Sr(a), or
from two consecutive spheres Sr(a) and Sr+1(a). In the ﬁrst case, if they come
from the same sphere, vJ (ai) = vJ (aj) and thus vL(bi) = vL(bj). Furthermore,
since ai and aj are in the same sphere, we conclude that pi and pj are in the
same sphere, and hence, by the winning strategy of Lemma 5.11, qi and qj are
in the same sphere. This, together with vL(bi) = vL(bj), means that bi and bj
are in the same ball.
Assume now that ai ∈ Sr(a) and aj ∈ Sr+1(a). From condition 2 of Lemma
5.11, for some r′
≤ d we have qi ∈ Sr′ (a) and qj ∈ Sr′+1(a). Now there are
two cases. In the ﬁrst case, vJ (ai) = vJ (aj). Then there are two possibilities.
If r, r + 1 ∈ J, then r′
, r′
+ 1 ∈ L (by condition 2 of Lemma 5.11), and
hence vL(bi) = vJ (ai) = vJ (aj) = vL(bj) implies that bi, bj are in the same
ball, and E(bi, bj) holds. The other possibility is that r + 1 ∈ J, r ∈ J. Then
r′
+ 1 ∈ L, r′
∈ L, and again we conclude E(bi, bj).
The second case is when vJ (ai) = vJ (aj). This could only happen if r is
in J (and thus r + 1 ∈ J). Then again by condition 2 of Lemma 5.11, r′
∈
L, r′
+ 1 ∈ L. Suppose vJ (ai) = 0. Then vL(bi) = 0, and vL(bj) = vJ (aj) = 1.
Since bi ∈ Sr′ (a, b) and bj ∈ Sr′+1(a, b), and r′
∈ L, both bi and bj must
belong to the same ball (Bd(a) or Bd(b)), and hence E(bi, bj) holds.
Thus, E(ai, aj) implies E(bi, bj); the proof of the converse – that E(bi, bj)
implies E(ai, aj) – is identical.
Finally, assume that ai ≤a aj. If ai ∈ Sr(a, b), aj ∈ Sr′ (a, b) and r < r′
,
then, by condition 2 of Lemma 5.11, bi ∈ Sr0 (a, b), bj ∈ Sr′
0
(a, b) for some
r0 < r′
0, and hence bi ≤b bj.
Thus, it remains to consider the case of ai, aj being in the same sphere;
that is, ai, aj ∈ Sr(a, b). If pi = pj, then pi ≺a pj and hence qi ≺a pj, which
in turn implies bi ≤b bj. The ﬁnal possibility is that of pi = pj; then either
(1) ai ∈ Sr(a) and aj = h(ai), or (2) aj ∈ Sr(a) and ai = h(aj). We prove
case (1); the proof of case (2) is identical.
Note that the orderings ≤a and ≤b are deﬁned in such a way that whenever
x = h(y), then ≤a orders them according to vJ ; that is, if vJ (x) < vJ (y), then
x ≤a y, and if vJ (y) < vJ (x), then y ≤a x. The ordering ≤b behaves likewise
80 5 Ordered Structures
with respect to the function vL. Hence, if aj = h(ai) and ai ≤a aj, then
vJ (ai) = 0 and vJ (aj) = 1. From Lemma 5.11, qi = qj, and thus bi and bj are
related by the isomorphism h. Since vL(bi) = 0 and vL(bj) = 1, we know that
bi ≤b bj.
This concludes the proof that ai ≤a aj implies bi ≤b bj; the proof of the
converse is identical. Thus, we have proved, using Lemma 5.11, that Aa ≡k Ab,
which is precisely what is needed to conclude (weak) locality of Q. It thus
remains to prove Lemma 5.11.
Proof of Lemma 5.11. We shall refer to moves in the game on Nd,J (a) and
Nd,L(a) as pi (in Nd,J (a)) and qi (in Nd,L(a)), and to moves in the game on
(Ms, J) and (Ms, L), provided by Lemma 5.10, as ei for (Ms, J) and fi for
(Ms, L).
For two elements x, y in the universe of Ms (which is {1, . . . , d + 1}), the
distance between them is |x − y |. The next claim shows that after i rounds,
distances up to 2k−i
between played elements, and elements of the sets J and
L, are preserved.
Claim 5.12. Let e1, . . . , ei and f1, . . . , fi be elements played in the ﬁrst i
rounds of the game on (Ms, J) and (Ms, L). Then:
• if |ej1 − ej2 |≤ 2k−i
, then |fj1 − fj2 |=|ej1 − ej2 |;
• if |ej1 − ej2 |> 2k−i
, then |fj1 − fj2 |> 2k−i
;
• if min
x∈J,j≤i
|x − ej |≤ 2k−i
, then min
x∈J,j≤i
|x − ej |= min
y∈L,j≤i
|y − fj |;
• if min
x∈J,j≤i
|x − ej |> 2k−i
, then min
y∈L,j≤i
|y − fj |> 2k−i
.
Proof of Claim 5.12. Since we know that (Ms, J) ≡k+2 (Ms, L), it suﬃces to
show that for any x, y, p ≤ k, and any r ≤ 2p
, there is a formula of quantiﬁer
rank p + 1 that tests if | x − y |= r, and there is a formula of quantiﬁer
rank p + 2 that tests if the minimum distance from x to an element of the
set (interpreted as J and L in the models) is exactly r. We prove the ﬁrst
statement; the second is an easy exercise for the reader.
We deﬁne α0(x, y) ≡ (x = y); this tests if the distance is zero. To test if
the distance is one, we see if x is the successor of y or y is the successor of x:
α1(x, y) ≡ x < y ∧ ¬∃z (x < z ∧ z < y) ∨ y < x ∧ ¬∃z (y < z ∧ z < x) .
Now, suppose for each r ≤ 2p
, we have a formula αr(x, y) in FO[p + 1] testing
if the distance is r. We now show how to test distances up to 2p+1
using
FO[p + 2] formulae. Suppose 2p
< r ≤ 2p+1
. The formula αr is of the form
(x < y) ∧ α′
r(x, y) ∨ (y < x) ∧ α′′
r (x, y) . We present α′
r(x, y) below. Let
r1, r2 ≤ 2p
be such that r1 + r2 = r. Then
α′
r(x, y) ≡ ∃z (x < z) ∧ (z < y) ∧ αr1 (x, z) ∧ αr2 (z, y) .
5.3 Locality of Order-invariant FO 81
Clearly, this increases the quantiﬁer rank by 1. This proves the claim.
Given x ∈ Sr(a) and y ∈ Sr′ (a), deﬁne δ(x, y) as r−r′
. Given x1, . . . , xm in
Bd(a), and u ≥ 0, we deﬁne a structure Su[x1, . . . , xm] as follows. Its universe
is {x | −u ≤ δ(x, xi) ≤ u, i ≤ m}. It inherits binary relations E and ≺ from
Bd(a). Note that the universe of Su[x1, . . . , xm] is a union of spheres. Suppose
these are spheres Sr1 (a), . . . , Srw (a), with r1 < . . . < rw. Then the vocabulary
of Su[x1, . . . , xm] contains w unary predicates U1, . . . , Uw, interpreted as
Sr1 (a), . . . , Srw (a).
Furthermore, SJ
u[x1, . . . , xm] and SL
u [x1, . . . , xm] extend Su[x1, . . . , xm]
by means of an extra unary relation U interpreted as the union of spheres
Sri (a) with ri ∈ J (ri ∈ L, respectively).
We shall be interested in the parameter u of the form 2k−i
, i ≤ k, and now
deﬁne a relation SJ
2k−i [x1, . . . , xm] ∼k−i SL
2k−i [y1, . . . , ym]. The ﬁrst condition
is as follows:
If the universe of SJ
2k−i [x1, . . . , xm] is a union
of w spheres, Sr1 (a) ∪ . . . ∪ Srw (a), then the
universe of SL
2k−i [y1, . . . , ym] is a union of w
spheres, Sr′
1
(a) ∪ . . . ∪ Sr′
w
(a), and rj ∈ J iﬀ
r′
j ∈ L.
(5.6)
Deﬁne ∆u(r1, . . . , rw) as {j > 1 | rj+1 − rj > u}. The second condition is:
∆2k−i (r1, . . . , rw) = ∆2k−i (r′
1, . . . , r′
w). (5.7)
For 1 ≤ j < j′
≤ w + 1, deﬁne the restriction SJ
u[x1, . . . , xm]j′
j
to include only the spheres from Srj (a) up to Srj′−1
(a) (and likewise for
SL
u [y1, . . . , ym]j′
j ). The next condition is:
For each consecutive j, j′
∈ {1, w + 1} − ∆2k−i (r1, . . . , rw),
SJ
2k−i [x1, . . . , xm]j′
j ≡i SL
2k−i [y1, . . . , ym]j′
j .
(5.8)
We now write SJ
2k−i [x1, . . . , xm] ∼k−i SL
2k−i [y1, . . . , ym] if (5.6), (5.7), and
(5.8) hold.
Our goal is to show that the duplicator can play in such a way that, after
i moves,
SJ
2k−i [p0, p1, . . . , pi] ∼k−i SL
2k−i [q0, q1, . . . , qi], (5.9)
where p0 = q0 = a.
The proof is by induction on i. The case of i = 0 (i.e., SJ
2k−i [p0] ∼k
SL
2k−i [q0]) is immediate from the sparseness of J and L. We also set e0 =
f0 = 1.
Now suppose (5.9) holds, and the spoiler plays pi+1 ∈ Nd,J (a), such that
pi+1 ∈ Sr(a) (the case of the move qi+1 ∈ Nd,L(a) is symmetric). The duplicator
sets ei+1 ∈ {1, . . ., d+1} to be r +1, and ﬁnds the response fi+1 to ei+1
82 5 Ordered Structures
in the game on (Ms, J) and (Ms, L), from position ((e0, . . . , ei), (f0, . . . , fi)).
Let fi+1 = r′
+ 1; then the response qi+1 will be found in Sr′ (a).
Assume that SJ
2k−i [p0, p1, . . . , pi] is the union of spheres Sr1 (a) ∪ . . . ∪
Srw (a), and SL
2k−i [q0, q1, . . . , qi] is the union of spheres Sr′
1
(a) ∪ . . . ∪ Sr′
w
(a).
We distinguish two cases.
Case 1. In this case | δ(pi+1, pj) |> 2k−(i+1)
for all j ≤ w (i.e., |
ei+1 − ej |> 2k−(i+1)
). From Claim 5.12, we conclude | δ(qi+1, qj) |> 2k−(i+1)
for all j. Since ei+1 and fi+1 satisfy all the same unary predicates over
(Ms, J) and (Ms, L), we see that there is an element qi+1 in Sr′ (a) such
that S2k [pi+1] ≡k+1 S2k [qi+1] and hence
S2k−(i+1) [pi+1] ≡k−(i+1) S2k−(i+1) [qi+1].
Moreover, by Claim 5.12, r ± l ∈ J iﬀ r′
± l ∈ L, for every l ≤ 2k−(i+1)
, and
hence
SJ
2k−(i+1) [pi+1] ≡k−(i+1) SL
2k−(i+1) [qi+1].
From here
SJ
2k−(i+1) [p0, p1, . . . , pi+1] ∼k−(i+1) SL
2k−(i+1) [q0, q1, . . . , qi+1]
follows easily. This implies (5.8), and (5.6), (5.7) follow from the construction.
The ﬁnal note to make about this case is that if d − r ≤ 2k−(i+1)
, then qi+1
can be chosen to be equal to pi+1, while preserving (5.9).
Case 2. In this case | δ(pi+1, pj0 ) |≤ 2k−(i+1)
for some j0 ≤ w. Find two
consecutive j, j′
∈ ∆2k−i (r1, . . . , rw) such that pi+1 is in SJ
2k−i [p0, . . . , pi]j′
j .
From Claim 5.12, |δ(qi+1, qj)|≤ 2k−(i+1)
. We then use (5.8) and ﬁnd qi+1 in
Sr′ (a) so that
SJ
2k−i [p0, . . . , pi, pi+1]j′
j ≡k−(i+1) SL
2k−i [q0, . . . , qi, qi+1]j′
j . (5.10)
Conditions (5.6) and (5.7) for 2k−(i+1)
now follow from Claim 5.12, and condition
(5.8) then follows from (5.10), since for every sphere which is a part of one
of the structures mentioned in (5.10), there is a unary predicate interpreted
as that sphere.
Finally, if d + 1 − ei+1 ≤ 2k−(i+1)
, then d + 1 − ej0 ≤ 2k−i
, and thus
pj0 = qj0 and the structures SJ
2k−i [p0, . . . , pi]j′
j and SL
2k−i [q0, . . . , qi]j′
j are
actually isomorphic. Hence, responding to pi+1 with qi+1 = pi+1 will preserve
the isomorphism of structures of the form SJ
2k−(i+1) [p0, . . . , pi, pi+1]l′
l
and SL
2k−(i+1) [q0, . . . , qi, qi+1]l′
l containing the sphere with pi+1 = qi+1.
This ﬁnally shows that the duplicator plays in such a way that (5.9) is
preserved. After k moves, the moves of the game (p, q) form a partial isomorphism.
Indeed, if pi1 , pi2 are in diﬀerent structures SJ
1 [p]j′
j and SJ
1 [p]l′
l , then
qi1 , qi2 are in diﬀerent structures SL
1 [q]j′
j and SL
1 [q]l′
l , and hence there is no
5.5 Exercises 83
E-relation between them. Furthermore, since ei1 < ei2 iﬀ fi1 < fi2 , we see
that pi1 ≺ pi2 iﬀ qi1 ≺ qi2 . If pi1 , pi2 are in the same structure SJ
1 [p]j′
j , then
qi1 , qi2 are in SL
1 [q]j′
j , and hence by (5.8), the E and ≺ relations between them
are preserved. Finally, since ei ∈ J iﬀ fi ∈ L, we have pi ∈ U iﬀ qi ∈ U. This
shows that (p, q) is a partial isomorphism between Nd,J(a) and Nd,L(a), and
thus ﬁnishes the proof of Lemma 5.11 and Proposition 5.9.
5.4 Bibliographic Notes
While the concept of invariant queries is extremely important in ﬁnite model
theory, over arbitrary models it is not interesting, as Exercise 5.1 shows.
The separating example of Theorem 5.3 is due to Gurevich, although he
never published it (it appeared as an exercise in [3]). Another separating
example is given in Exercise 5.2.
Locality of invariant FO-deﬁnable queries is due to Grohe and Schwentick
[113]. Their original proof is the subject of Exercises 5.8 and 5.9; the proof
presented here is a slight simpliﬁcation of that proof. It uses the concept of
weak locality, introduced in Libkin and Wong [170].
Sources for exercises:
Exercise 5.1: Ebbinghaus and Flum [60]
Exercise 5.2: Otto [192]
Exercises 5.3 and 5.4: Libkin and Wong [170]
Exercises 5.7–5.9: Grohe and Schwentick [113]
Exercise 5.11: Rossman [210]
5.5 Exercises
Exercise 5.1. Prove that over arbitrary structures, FO = (FO+<)inv.
Hint: use the interpolation theorem.
Exercise 5.2. The goal of this exercise it to give another separation example for
FO (FO+ <)inv. We consider structures in the vocabulary σ = (U1, U2, E, R, S)
where U1, U2 are unary and E, R, S are binary. We consider a class C of structures
A ∈ STRUCT[σ] that satisfy the following conditions:
1. U1 and U2 partition the universe A.
2. E ⊆ U1 × U1 and S ⊆ U2 × U2.
3. The restriction of A to U2, S is a Boolean algebra (we refer to its set of atoms
as X).
4. |X |=|U1 |= 2m; moreover, if U1 = {u1, . . . , u2m} and X = {x1, . . . , x2m}, then
R =
m[
i=1
{u2i−1, u2i} × {x2i−1, x2i}.
84 5 Ordered Structures
First, prove that the class C is FO-deﬁnable. Next, consider the following Boolean
query Q on C:
Q(A) = true iﬀ U1, E is connected.
Prove that Q ∈ (FO+<)inv on C, but that Q is not FO-deﬁnable on C.
Exercise 5.3. Give an example of a query that is weakly local, but is not Gaifman-
local.
Exercise 5.4. Prove that weak locality implies the BNDP for binary queries. Does
this implication hold for m-ary queries, where m > 2?
Exercise 5.5. Using Proposition 5.9, prove that acyclicity and k-colorability are
not deﬁnable in FO+<.
Exercise 5.6. Prove Lemma 5.10.
Exercise 5.7. In the proof of weak locality of invariant queries presented in this
chapter, we only dealt with nonoverlapping neighborhoods. To deal with the case of
overlapping Bd(a) and Bd(b), prove the following.
Let d′
= 5d + 1, and let a ≈A
d′ b. Then there exists a set X containing {a, b} and
an automorphism g on NA
d (X) such that g(a) = b.
Exercise 5.8. Prove that every unary query in (FO+<)inv is Gaifman-local.
The main ingredients have already been presented in this chapter, but for the
case of nonoverlapping neighborhoods. To deal with the case of overlapping neighborhoods
Nd(a) and Nd(b), deﬁne d′
, g, and X as in Exercise 5.7.
Now note that each sphere Sr(X) is a union of g-orbits; that is, sets of the form
{gi
(v) | i ∈ Z}. For each orbit O, we ﬁx a node cO and deﬁne a linear ordering ≤0
on O by cO ≤0 g(cO) ≤0 g2
(cO) ≤0 . . .. Let ≤m be the image of ≤0 under gm
.
The deﬁnition of ≤a and ≤b is almost the same as the deﬁnition we used in the
proof of Proposition 5.9. We start with a ﬁxed order on orbits that respects distance
from X. It generates a preorder on Bd(X), which we reﬁne to two diﬀerent orders
in the following way. On S0(X), we let ≤a be ≤0 and ≤b be ≤1= g(≤0). Then, for
suitably deﬁned J and L (cf. the proof of Proposition 5.9), we do the following. Let
J = {j1, . . . , jm}, j1 < . . . < jm. For all spheres Sr(X), r < j1, the order on each
orbit is ≤0, but on Sj1 (X) we use ≤1 instead. We continue to use ≤1 until Sj2−1(X),
and on Sj2 (X) we switch to ≤2, and so on. For ≤b, we do the same, except that we
use the set L instead. We choose J and L so that | J |=| L | +1, which means that
on Sd(X), both ≤a and ≤b coincide.
The goal of the exercise is then to turn this sketch (together with the proof of
Proposition 5.9) into a proof of locality of unary queries in (FO+<)inv.
Exercise 5.9. The goal of this exercise is to complete the proof of Theorem 5.8.
Using Exercise 5.8, show that every m-ary query in (FO+ <)inv, for m > 1, is
Gaifman-local.
Exercise 5.10. Calculate the locality rank of an order-invariant query produced in
the proof of Theorem 5.8. You will probably have to use Exercise 3.10.
5.5 Exercises 85
Exercise 5.11. We know that FO (FO+ <)inv. What about (FO + Succ)inv?
Clearly
FO ⊆ (FO + Succ)inv ⊆ (FO+<)inv,
and at least one containment must be proper. Find the exact relationship between
these three classes of queries.
Exercise 5.12.∗
Consider again the vocabulary σ<,+ and a class C<,+ of σ<,+structures
where < is interpreted as a linear ordering, and + as the addition corresponding
to <. Prove that every query in (FO + C<,+)inv is local.
6
Complexity of First-Order Logic
The goal of this chapter is to study the complexity of queries expressible
in FO. We start with the general deﬁnition of diﬀerent ways of measuring
the complexity of a logic over ﬁnite structures: these are data, expression, and
combined complexity. We then connect FO with Boolean circuits and establish
some bounds on the data complexity. We also consider the issue of uniformity
for a circuit model, and study it via logical deﬁnability. We then move to
the combined complexity of FO, and show that it is much higher than the
data complexity. Finally, we investigate an important subclass of FO queries
– conjunctive queries – which play a central role in database theory.
6.1 Data, Expression, and Combined Complexity
Let us ﬁrst consider the complexity of the model-checking problem: that is,
given a sentence Φ in a logic L and a structure A, does A satisfy Φ? There
are two parameters of this question: the sentence Φ, and the structure A.
Depending on which of them are considered parameters of the problem, and
which are ﬁxed, we get three diﬀerent deﬁnitions of complexity for a logic.
Complexity theory deﬁnes its main concepts via acceptance of string languages
by computational devices such as Turing machines. To talk about
complexity of logics on ﬁnite structures, we need to encode ﬁnite structures
and logical formulae as strings. For formulae, we shall assume some natural
encoding: for example, enc(ϕ), the encoding of a formula ϕ, could be its syntactic
tree (represented as a string). For the notion of data complexity, deﬁned
below, the choice of a particular encoding of formulae does not matter.
There are several diﬀerent ways to encode structures. The one we use here
is the one most often used, but others are possible, and sometimes provide
additional useful information about the running time of query-evaluation al-
gorithms.
Suppose we have a structure A ∈ STRUCT[σ]. Let A = {a1, . . . , an}. For
encoding a structure, we always assume an ordering on the universe. In some
88 6 Complexity of First-Order Logic
structures, the order relation is a part of the vocabulary; in others, it is not,
and then we arbitrarily choose one. The order in this case will have no eﬀect
on the result of queries, but we need it to represent the encoding of a structure
on the tape of a Turing machine, to be able to talk about computability and
complexity of queries.
Thus, we choose an order on the universe, say, a1 < a2 < . . . < an. Each
k-ary relation RA
will be encoded by an nk
-bit string enc(RA
) as follows.
Consider an enumeration of all k-tuples over A, in the lexicographic order
(i.e., (a1, . . . , a1), (a1, . . . , a1, a2), . . . , (an, . . . , an, an−1), (an, . . . , an)). Let aj
be the jth tuple in this enumeration. Then the jth bit of enc(RA
) is 1 if
aj ∈ RA
, and 0 if aj ∈ RA
. We shall assume without any loss of generality
that σ contains only relation symbols, since a constant can be encoded as a
unary relation containing one element.
If σ = {R1, . . . , Rp}, then the basic encoding of a structure is the concatenation
of the encodings of relations: enc(RA
1 ) · · · enc(RA
p ). In some computational
models (e.g., circuits), the length of the input is a parameter of
the model and thus |A| can easily be calculated from the basic encoding; in
others (e.g., Turing machines), |A| must be known by the device in order to
use the encoding of a structure. For that purpose, we deﬁne an enc(A) which
is simply the concatenation of 0n
1 and all the enc(RA
i )’s:
enc(A) = 0n
1 · enc(RA
1 ) · · · enc(RA
p ). (6.1)
The length of this string, denoted by A , is
A = (n + 1) +
p
i=1
narity(Ri)
. (6.2)
Deﬁnition 6.1. Let K be a complexity class, and L a logic. We say that
• the data complexity of L is K if for every sentence Φ of L, the language
{enc(A) | A |= Φ}
belongs to K;
• the expression complexity of L is K if for every ﬁnite structure A, the
language
{enc(Φ) | A |= Φ}
belongs to K; and
• the combined complexity of L is K if the language
{(enc(A), enc(Φ)) | A |= Φ}
belongs to K.
6.2 Circuits and FO Queries 89
• Furthermore, we say that the combined complexity of L is hard for K (or
K-hard) if the language {(enc(A), enc(Φ)) | A |= Φ} is a K-hard problem.
The data complexity is K-hard if for some Φ, {enc(A) | A |= Φ} is a hard
problem for K, and the expression complexity is K-hard if for some A,
{enc(Φ) | A |= Φ} is K-hard.
• A problem that is both in K and K-hard is complete for K, or K-complete.
Thus, we can talk about data/expression/combined complexity being K-
complete.
Given our standard choice of encoding, we shall sometimes omit the notation
enc(·), instead writing {A | A |= Φ} ∈ K, etc.
The notion of data complexity is most often used in the database context:
the structure A corresponds to a large relational database, and the much
smaller sentence Φ is a query that has to be evaluated against A; hence Φ is
ignored in this deﬁnition. The notions of expression and combined complexity
are often used in veriﬁcation and model-checking, where a complex speciﬁcation
needs to be evaluated on a description of a ﬁnite state machine; in this
case the speciﬁcation Φ may actually be more complex than the structure
A. We shall also see that for most logics of interest, all the hardness results
for the combined complexity will be shown on very simple structures, thereby
giving us matching bounds for the combined and expression complexity. Thus,
we shall concentrate on the data and combined complexity.
We deﬁned the notion of complexity for sentences only. The notion of data
complexity has a natural extension to formulae with free variables deﬁning
non-Boolean queries. Suppose an m-ary query Q is deﬁnable by a formula
ϕ(x1, . . . , xm). Then the data complexity of Q is the complexity of the language
{(enc(A), enc({a})) | a ∈ Q(A)}. This is the same as the data complexity
of the sentence (∃!x S(x)) ∧ (∀x (S(x) → ϕ(x))), where S is a new
m-ary relation symbol not in σ (we assume that the logic L is closed under the
Boolean connectives and ﬁrst-order quantiﬁcation). Recall that the quantiﬁer
∃!x means “there exists a unique x”. Thus, as long as L has the right closure
properties, we can only consider data complexity with respect to sentences.
6.2 Circuits and FO Queries
In this section we show how to code FO sentences over ﬁnite structures by
Boolean circuits. This coding will give us bounds for both the data and combined
complexity of FO.
Deﬁnition 6.2. A Boolean circuit with n inputs x1, . . . , xn is a tuple
C = (V, E, λ, o),
where
90 6 Complexity of First-Order Logic
1. (V, E) is a directed acyclic graph with the set of nodes V (which we call
gates) and the set of edges E.
2. λ is a function from V to {x1, . . . , xn} ∪ {∨, ∧, ¬} such that:
• λ(v) ∈ {x1, . . . , xn} implies that v has in-degree 0;
• λ(v) = ¬ implies that v has in-degree 1.
3. o ∈ V .
The in-degree of a node is called its fan-in. The size of C is the number of
nodes in V ; the depth of C is the length of the longest path from a node of
in-degree 0 to o.
A circuit C computes a Boolean function with n inputs x1, . . . , xn as follows.
Suppose we are given values of x1, . . . , xn. Initially, we compute the
values associated with each node of in-degree 0: for a node labeled xi, it is
the value of xi; for a node labeled ∨ it is false; and for a node labeled ∧ it is
true. Next, we compute the value of each node by induction: if we have a node
v with incoming edges from v1, . . . , vl, and we know the values of a1, . . . , al
associated with v1, . . . , vl, then the value a associated with v is:
• a1 ∨ . . . ∨ al if λ(v) = ∨;
• a1 ∧ . . . ∧ al if λ(v) = ∧;
• ¬a1 if λ(v) = ¬ (in this case we know that l = 1).
The output of the circuit is the value assigned to the node o. An example
of a circuit computing the Boolean function x1 ∧ ¬x2 ∧ x3 ∨ ¬ x3 ∧ ¬x4 is
shown in Fig. 6.1; the output node is depicted as a double circle.
Note that a circuit with no inputs is possible, and its in-degree zero gates
are labeled ∨ or ∧. Such a circuit always outputs a constant (i.e., true or
false).
We next deﬁne families of circuits and languages in {0, 1}∗
they accept.
Deﬁnition 6.3. A family of circuits is a sequence C = (Cn)n≥0 where each
Cn is a circuit with n inputs. It accepts the language L(C) ⊆ {0, 1}∗
deﬁned
as follows. Let s be a string of length n. It can be viewed as a Boolean vector
xs such that the ith component of xs is the ith symbol in s. Then s ∈ L(C)
iﬀ Cn outputs 1 on xs.
A family of circuits C is said to be of polynomial size if there is a polynomial
p : N → N such that the size of each Cn is at most p(n). For a function
f : N → N, we say that C is of depth f(n) if the depth of Cn is at most f(n).
We say that C is of constant depth if there is d > 0 such that for all n, the
depth of Cn is at most d.
The class of languages accepted by polynomial-size constant-depth families
of circuits is called nonuniform AC0
.
6.2 Circuits and FO Queries 91
6
6
6
6
6
6
 I 
*
x1 x2 x3 x4
¬ ¬
∧ ∧
¬
∨
Fig. 6.1. Boolean circuit computing (x1 ∧ ¬x2 ∧ x3) ∨ ¬(x3 ∧ ¬x4)
For example, the language that consists of strings containing at least two
ones is in nonuniform AC0
: each circuit Cn, n > 1, has ∧-gates for every pair
of inputs xi and xj, and then the outputs of those ∧-gates form the input for
one ∨-gate.
A class of structures C ⊆ STRUCT[σ] is in nonuniform AC0
if so is the
language {enc(A) | A ∈ C}. An example of a class of structures that is not
FO-deﬁnable, but belongs to nonuniform AC0
, is the class even of structures
of the empty vocabulary: that is, { A, ∅ | | A | mod 2 = 0}. The coding of
such a structure with |A|= n is simply 0n
1; hence Ck always returns true for
odd k (as it corresponds to structures of even cardinality), and false for even
k.
Next, we extend FO as follows. Let P be a collection, ﬁnite or inﬁnite, of
numerical predicates; that is, subsets of Nk
. For example, they may include <,
+ considered as a ternary predicate {(i, j, l) | i + j = l}, etc. For P including
the linear order, we deﬁne FO(P) as an extension of FO with atomic formulae
of the form P(x1, . . . , xk), for a k-ary P ∈ P. The semantics is deﬁned as
follows. Suppose A is a σ-structure, and its universe A is ordered by < as
a0 < . . . < an−1. Then A |= P(ai1 , . . . , aik
) iﬀ the tuple of numbers (i1, . . . , ik)
belongs to P.
For example, let P2 ⊂ N consist of the even numbers. Then the query
even is expressed as an FO({<, P2}) sentence as follows:
∀x ∀y (y ≤ x) → P2(x) .
We are now interested in the class FO(All) where All stands for the family
of all numerical predicates; that is, all subsets of N, N2
, N3
, etc. We now show
the connection between FO(All) and nonuniform AC0
.
92 6 Complexity of First-Order Logic
Theorem 6.4. Let C be a class of structures deﬁnable by an FO(All) sentence.
Then C is in nonuniform AC0
. That is,
FO(All) ⊆ nonuniform AC0
.
Furthermore, for every FO(All) sentence Φ, there is a family of circuits of
depth O( Φ ) accepting {A | A |= Φ}.
Proof. We describe each circuit Ck in the family C accepting {A | A |= Φ}. If
k is not of the form A for some structure A, then Ck always returns false.
Assume k is given by (6.2); that is, k is the size of the encodings of structures A
with an n-element universe. We then convert Φ into a quantiﬁer-free sentence
Φ′
over the vocabulary σ, predicate symbols in All, and constants 0, . . . , n − 1
as follows. Inductively, we replace each quantiﬁer ∃xϕ(x, y) or ∀xϕ(x, y) with
n−1
c=0
ϕ(c, y) and
n−1
c=0
ϕ(c, y),
respectively. Notice that the number of connectives ∨, ∧, ¬, , in Φ′
is exactly
the same as the number of connectives ∨, ∧, ¬ and quantiﬁers ∃, ∀ in
Φ.
We now build the circuit to evaluate Φ′
. Note that Φ′
is a Boolean combination
(using connectives ∨, ∧, ¬, , ) of formulae of the form P(i1, . . . , ik),
where P is a numerical predicate, and R(i1, . . . , im), where R is an m-ary
symbol in σ. The former is replaced by its truth value (which is either a ∨ or
a ∧ gate with zero inputs), and the latter corresponds to one bit in enc(A);
that is, the input of the circuit. The depth of the resulting circuit is bounded
by the number of connectives ∨, ∧, ¬, , in Φ′
, and hence depends only on
Φ, and not on k. The size of the circuit Ck is clearly polynomial in k, which
completes the proof.
Corollary 6.5. The data complexity of FO(All) is nonuniform AC0
.
We conclude this section with another bound on the complexity of FO
queries. This time we determine the running time of such a query in terms of
the sizes of encodings of a query and a structure.
Given an FO formula ϕ, its width is the maximum number of free variables
in a subformula of ϕ.
Proposition 6.6. Let Φ be an FO sentence in vocabulary σ, and let A ∈
STRUCT[σ]. If the width of Φ is k, then checking whether A |= Φ can be done
in time
O( Φ × A
k
).
Proof. Assume, without loss of generality, that Φ uses ∧, ¬, and ∃ but not ∨
and ∀. Let ϕ1, . . . , ϕm enumerate all the subformulae of Φ; we know that they
6.3 Expressive Power with Arbitrary Predicates 93
contain at most k free variables. We now inductively construct ϕi(A). If ϕi has
ki free variables, then ϕi(A) ⊆ Aki
. It will be represented by a Boolean vector
of length nki
, where n =|A|, in exactly the same way as we code relations in
A.
If ϕi is an atomic formula R(x1, . . . , xki ), then ϕi(A) is simply the encoding
of R in enc(A). If ϕi is ¬ϕj(A), we simply ﬂip all the bits in the representation
of ϕj(A). If ϕi is ϕj ∧ ϕl, there are two cases. If the free variables of ϕj and
ϕl are the same, then ϕi(A) is obtained as the bit-wise conjunction of ϕj(A)
and ϕl(A). Otherwise, ϕi(x, y, z) = ϕj(x, y) ∧ ϕl(x, z), and ϕi(A) is the join
of ϕj(A) and ϕl(A), obtained by ﬁnding, for all tuples over a ∈ A|x|
, tuples
b ∈ A|y|
and c ∈ A|z|
such that the bits corresponding to (a, b) in ϕj(A) and to
(a, c) in ϕl(A) are set to 1, and then setting the bit corresponding to (a, b, c)
in ϕi(A) to 1. Finally, if ϕi(x) = ∃zϕj(z, x), we simply go over ϕj(A), and if
the bit corresponding to (a, a) is set to 1, then we set the bit corresponding
to a in ϕi(A) to 1.
The reader can easily check that the above algorithm can be implemented
in time O( Φ × A
k
), since none of the formulae ϕi has more than k free
variables.
6.3 Expressive Power with Arbitrary Predicates
In the previous section, we introduced a powerful extension of FO – the logic
FO(All). Since this logic can use arbitrary predicates on the natural numbers,
it can express noncomputable queries: for example, we can test if the size of
the universe of A is a number n which codes a pair (k, m) such that the kth
Turing machine halts on the mth input (assuming some standard enumeration
of Turing machines and their inputs). Nevertheless, we can prove some
strong bounds on the expressiveness of FO(All): although we saw that even
is FO(All)-expressible, it turns out that the closely related query, parity, is
not.
Recall that parityU is a query on structures whose vocabulary σ contains
one unary relation symbol U. Then
parityU (A) ⇔ |UA
| mod 2 = 0.
We shall omit the subscript U if it is understood from the context.
To show that parity is not FO(All)-expressible, we consider the Boolean
function parity with n arguments (for each n) deﬁned as follows:
parity(x1, . . . , xn) =
1 if |{i | xi = 1}| mod 2 = 0,
0 otherwise.
We shall need the following deep result in circuit complexity.
94 6 Complexity of First-Order Logic
Theorem 6.7 (Furst-Saxe-Sipser, Ajtai). There is no constant-depth
polynomial-size family of circuits that computes parity.
Corollary 6.8. parity is not expressible in FO(All).
Proof. Assume, to the contrary, that parity is expressible. By Theorem 6.4,
there is a polynomial-size constant-depth circuit family C that computes
parity on encodings of structures. Such an encoding of a structure A with
|A|= n is 0n
1 · s, where s is the string of length n whose ith element is 1 iﬀ
the ith element of A is in UA
.
We now use C to construct a new family of circuits deﬁning parity. The
circuit with n inputs x1, . . . , xn works as follows. For each xi, it adds an indegree
0 gate gi labeled ∨, and for xn it also adds an in-degree 0 gate g′
n
labeled ∧. Then it puts C2n+1, the circuit with 2n + 1 inputs from C on the
outputs of g1, . . . , gn, g′
n followed by x1, . . . , xn, as shown below:
6 6 6 6 6
6
C2n+1
... .........∨ ∨ ∧ x1 xn
Clearly this circuit computes parity(x1, . . . , xn), and by Theorem 6.4 the resulting
family of circuits is of polynomial size and bounded depth. This contradicts
Theorem 6.7.
As another example of inexpressibility in FO(All), we show the following.
Corollary 6.9. Graph connectivity is not expressible in FO(All).
Proof. We shall follow the idea of the proof of Corollary 3.19; however, that
proof used inexpressibility of the query even, which of course is deﬁnable in
FO(All). We modify the proof to make use of Corollary 6.8 instead.
First, we show that for a graph G = (V, E), where E is a successor relation
on a set U ⊆ V of nodes, FO(All) cannot test if the cardinality of U is
even. Indeed, suppose to the contrary that it can; then this can be done
in nonuniform AC0
, by a family of circuits C. We now show how to use C
to test parity. Suppose an encoding 0n
1 · s of a unary relation U is given,
where U = {i1, . . . , ik} ⊆ {1, . . . , n}. We transform U into a successor relation
SU = {(i1, i2), . . . , (ik−1, ik)}. We leave it to the reader to show how to use
bounded-depth circuits to transform 0n
1 · s into 0n
1 · s′
where s′
of length n2
codes SU . Then using the circuit Cn2+n+1 from C on 0n
1 · s′
we can test if U
is even.
Finally, using inexpressibility of parity of a successor relation, we show
inexpressibility of connectivity in FO(All) using the same proof as in Corollary
3.19.
6.4 Uniformity and AC0
95
6.4 Uniformity and AC0
We have noticed that nonuniform AC0
is not truly a complexity class: in fact,
the function that computes the circuit Cn from n need not even be recursive.
It is customary to impose some uniformity conditions that postulate how Cn is
obtained. While it is possible to formulate these conditions purely in terms of
circuits, we prefer to follow the logic connection, and instead put restrictions
on the choice of available predicates in FO(All).
We now associate a ﬁnite n-element universe of a structure with the set
{0, . . . , n−1}, and consider an extension of FO over σ-structures by adding two
ternary predicates, +++ and ×××, which are graphs of addition and multiplication.
That is,
+++ = {(i, j, k) | i + j = k} and ××× = {(i, j, k) | i · j = k}.
Note that we have to use +++ and ××× as ternary relations rather than binary
functions, to ensure that the result of addition or multiplication is always in
the universe of the structure. The resulting logic is denoted by FO(+++,×××).
Deﬁnition 6.10. The class of structures deﬁnable in FO(+++,×××) is called uniform
AC0
.
We shall normally omit the word uniform; hence, by referring to just AC0
,
we mean uniform AC0
. Note that many examples of AC0
queries seen so far
only use the standard arithmetic on the natural numbers; for example, even
is in AC0
.
It turns out that AC0
is quite powerful and can deﬁne several interesting
numerical relations on the domain {0, . . ., n−1}. One of them, which we shall
see quite often, is the bit relation:
BIT(x, y) is true ⇔ the yth bit of the binary expansion of x is 1.
For example, the binary expansion of x = 18 is 10010, and hence BIT(x, y)
is true if y is 1 or 4, and BIT(x, y) is false if y is 0, 2, or 3.
We now start building the family of functions deﬁnable in FO(+++,×××).
Whenever we say that a k-ary function is deﬁnable, we actually mean that
the graph of this function, a k + 1-ary relation, is deﬁnable. However, to
make formulae more readable, we often use functions instead of their graphs.
First, we note that the linear order is deﬁnable by x ≤ y ⇔ ∃z +++(x, z, y) (i.e.,
∃z (x+z = y)), and thus the minimum element 0, and the maximum element,
denoted by max, are deﬁnable.
Lemma 6.11. The integer division ⌊x/y⌋ and (x mod y) are deﬁnable in
FO(+++,×××).
96 6 Complexity of First-Order Logic
Proof. If y = 0, then
u = ⌊x/y⌋ ⇔ (u · y) ≤ x ∧ (∃v < y (x = u · y + v)) .
Furthermore,
u = (x mod y) ⇔ ∃v (v = ⌊x/y⌋) ∧ (u + y · v = x) .
In particular, we can express divisibility x | y as (x mod y) = 0.
Now our goal is to show the following.
Theorem 6.12. BIT is expressible in FO(+++,×××).
Proof. We shall prove this in several stages. First, note that the following tests
if x is a power of 2:
pow2(x) ≡ ∀u, v (x = u · v) ∧ (v = 1) → ∃z (v = z + z) .
This is because pow2(x) asserts that 2 is the only prime factor of x. Next, we
deﬁne the predicate
BIT′
(x, y) ≡ (⌊x/y⌋ mod 2) = 1.
Note that if y = 2z
, then BIT′
(x, y) is true iﬀ the zth bit of x is 1. Assume
that we can deﬁne the predicate y = 2z
. Then
BIT(x, y) ≡ ∃u u = 2y
∧ BIT′
(x, u) .
Thus, it remains to show how to express the binary predicate x = 2y
. We do
so by coding an iterative computation of 2y
. The codes of such computations
will be numbers, and as we shall see, those numbers can be as large as x4
.
Since we only quantify over {0, . . . , n − 1}, where n is the size of the ﬁnite
structure, we show below how to express the predicate
P2(x, y) ≡ x = 2y
∧ x4
≤ n − 1.
With P2, we can deﬁne x = 2y
as follows:
∃u∃v




y = 4v ∧ P2(u, v) ∧ x = u4
∨ y = 4v + 1 ∧ P2(u, v) ∧ x = 2 · u4
∨ y = 4v + 2 ∧ P2(u, v) ∧ x = 4 · u4
∨ y = 4v + 3 ∧ P2(u, v) ∧ x = 8 · u4



 .
We now show how to express P2(x, y). Let y =
k−1
i=0 yi · 2i
, so that y is
yk−1yk−2 . . . y1y0 in binary (we assume that the most signiﬁcant bit yk−1 is
1). Then 2y
=
k−1
i=0 2yi·2i
. We now deﬁne the following recurrences for i < k:
p0 = 1 a0 = 0 b0 = 1
pi+1 = 2pi ai+1 = ai + yi · 2i
bi+1 = bi · 2yi·2i
6.4 Uniformity and AC0
97
Thus, pi = 2i
, ai is the number whose binary representation is yi−1 . . . y0,
and bi = 2ai
. We deﬁne sequences p = (p0, . . . , pk), a = (a0, . . . , ak), b =
(b0, . . . , bk).
Next, we explain how to code these sequences. Notice that in all three
of them, the ith element needs at most 2i
bits to be represented in binary.
Suppose we have an arbitrary sequence c = (c0, . . . , ck), where each ci has
at most 2i
bits in binary. Such a sequence will be coded by a number c such
that its 2i
bits from 2i
to 2i+1
− 1 form the binary representation of ci. These
codes, when applied to p, a, and b, result in numbers p, a, and b, respectively.
These numbers turn out to be relatively small. Since the length of the binary
representation of y is k, we know that y ≥ 2k−1
. If x = 2y
, then x ≥ 22k−1
and x4
≥ 22k+1
. The binary representation of p, a, and b has at most 2k+1
− 1
bits, and hence the maximum value of those codes is 22k+1
−1
− 1, which is
bounded above by x4
. Hence, for deﬁning P2, codes of all the sequences will
be bounded by the size of the universe.
How can one extract numbers ci from the code c of c ? Notice that
⌊x/22i
⌋ mod 22i
is ci. In general, we deﬁne extract(x, u) ≡ ⌊x/u⌋ mod u, and thus ci =
extract(c, 22i
). Notice that since (22i
)2
= 22i+1
, for u = 22i
we have ci =
extract(c, u) and ci+1 = extract(c, u2
).
Assume now that we have an extra predicate ppow2(u) which holds iﬀ u
is of the form 22i
. With this, we express P2(x, y) by stating the existence of
a, b, p (coding a, b, p) such that:
• extract(p, 2) = 1, extract(a, 2) = 0, extract(b, 2) = 1 (the initial conditions
of the recurrences hold).
• If u < x and ppow2(u), then extract(p, u2
) = 2 · extract(p, u) (the recurrence
for p is correct).
• If u < x and ppow2(u), then either
1. extract(a, u2
) = extract(a, u) and extract(b, u2
) = extract(b, u), or
2. extract(a, u2
) = extract(a, u) + extract(p, u) and extract(b, u2
) = u ·
extract(b, u).
That is, the recurrences for a and b are coded correctly: the ﬁrst case
corresponds to yi = 0, and hence ai+1 = ai and bi+1 = bi; the second case
corresponds to yi = 1, and hence ai+1 = ai +pi and bi+1 = bi ·22i
= bi ·u.
• There is u such that ppow2(u) holds, extract(a, u) = y, and
extract(b, u) = x. That is, the sequences show that 2y
= x.
Clearly, the above can be expressed as an FO formula.
98 6 Complexity of First-Order Logic
All that remains is to show how to express the predicate ppow2(u). This
in turn is done in two steps. First, we deﬁne a predicate P1(v) that holds iﬀ
v is of the form
s
i=1 22i
(i.e., in its binary representation, ones appear only
in positions corresponding to powers of 2). With this predicate, we deﬁne
ppow2(u) ≡ pow2(u) ∧ ∃w P1(w) ∧ BIT′
(w, u).
Note that if ppow2(u) holds, one can ﬁnd w with BIT′
(w, u) such that w ≤ u2
;
given that all numbers for which ppow2(·) is checked are below 4
√
n − 1, the
∃w is guaranteed to range over the ﬁnite universe.
To express P1, we need an auxiliary formula pow4(u) = pow2(u) ∧
(u mod 3 = 1) testing if u is a power of 4. Now P1(u) is the conjunction
of ¬BIT′
(u, 1) ∧ BIT′
(u, 2) and the following formula:
∀v 2 < v ≤ u → BIT′
(u, v) ↔ (pow4(v)∧∃w [(w ·w = v)∧BIT′
(u, w)]) .
This formula states that 1-bits in the binary representation of u are 2 and
others given by the sequence e1 = 2, e2 = 4, . . . , ei+1 = e2
i ; that is, bits in
positions of the form 22i
. This deﬁnes P1, and thus completes the proof of the
theorem.
The BIT predicate turns out to be quite powerful. First note the following.
Lemma 6.13. Addition is deﬁnable in FO(<, BIT).
Proof. We use the standard carry-lookahead algorithm. Given x, y, and u, we
deﬁne carry(x, y, u) to be true if, while adding x, y given as binary numbers,
the carry bit with number u is 1:
∃v
v < u ∧ BIT(x, v) ∧ BIT(y, v)
∧ ∀w (w < u ∧ w > v) → (BIT(x, w) ∨ BIT(y, w))
.
Then x + y = z iﬀ
∀u BIT(z, u) ↔ (BIT(x, u) ⊕ BIT(y, u)) ⊕ carry(x, y, u) ,
where ϕ ⊕ ψ is an abbreviation for ϕ ↔ ¬ψ.
A more complicated result (see Exercise 6.5) states the following.
Lemma 6.14. Multiplication is deﬁnable in FO(<, BIT).
We thus obtain:
Corollary 6.15. FO(<, BIT) = FO(+++,×××).
Hence, uniform AC0
can be characterized as the class of structures deﬁnable
in FO(<, BIT).
6.6 Parametric Complexity and Locality 99
6.5 Combined Complexity of FO
We have seen that the data complexity of FO(All) is nonuniform AC0
, and
the data complexity of FO is AC0
. What about the combined and expression
complexity of FO? It turns out that they belong to a much larger class than
AC0
.
Theorem 6.16. The combined complexity of FO is Pspace-complete.
Proof. The membership in Pspace follows immediately from the evaluation
method used in the proof of Proposition 6.6. To show hardness, recall the
problem QBF, satisﬁability of quantiﬁed Boolean formulae.
Problem: QBF
Input: A formula Φ = Q1x1 . . . Qnxn α(x1, . . . , xn), where:
each Qi is either ∃ or ∀, and
α is a propositional formula in x1, . . . , xn.
Question: If all xi’s range over {true, false}, is Φ true?
It is known that QBF is Pspace-hard (see the bibliographic notes at the
end of the chapter). We now prove Pspace-hardness of FO by reduction from
QBF.
Given a formula Φ = Q1x1 . . . Qnxn α(x1, . . . , xn), construct a structure
A whose vocabulary includes one unary relation U as follows: A = {0, 1},
and UA
= {1}. Then modify α by changing each occurrence of xi to U(xi),
and each occurrence of ¬xi to ¬U(xi). Let αU
be the resulting formula. For
example, if α(x1, x2, x3) = (x1 ∧x2)∨(¬x1 ∧x3), then αU
is (U(x1)∧U(x2))∨
(¬U(x1) ∧ U(x3)). Then
Φ is true ⇔ A |= Q1x1 . . . Qnxn αU
(x1, . . . , xn),
which proves Pspace-hardness.
Since the structure A constructed in the proof of Theorem 6.16 is ﬁxed,
we obtain:
Corollary 6.17. The expression complexity of FO is Pspace-complete.
For most of the logics we study, the expression and combined complexity
coincide; however, this need not be the case in general.
6.6 Parametric Complexity and Locality
Proposition 6.6 says that checking whether A |= Φ can be done in time
O( Φ · A k
), where k is the width of Φ: the maximum number of free
variables of a subformula of Φ. In particular, this gives a polynomial time
100 6 Complexity of First-Order Logic
algorithm for evaluating FO queries on ﬁnite structures, for a ﬁxed sentence
Φ. Although polynomial time is good, in many cases it is not suﬃcient: for
example, in the database context where A is very large, even for small k
the running time O( A
k
) may be prohibitively expensive (in fact, the goal
of most join algorithms in database systems is to reduce the running time
from the impractical O(n2
) to O(n log n) – at least if the result of the join
is not too large – and running time of the order n10
is completely out of the
question).
The question is, then, whether sometimes (or always) one can ﬁnd better
algorithms for evaluating FO queries on ﬁnite structures. In particular,
it would be ideal if one could always guarantee time linear in A .
Since the combined complexity of FO queries is Pspace-complete, something
must be exponential, so in that case we would expect the complexity to be
O g( Φ )· A , where g : N → N is some function.
This is the setting of parameterized complexity, where the standard input of
a problem is split into the input part and the parameter part, and one looks
for ﬁxed parameter tractable problems that admit algorithms with running
time O(g(π)·np
) for a ﬁxed p; here π is the size of the parameter, and n is the
size of the input. It is known that even some NP-hard problems become ﬁxed
parameter tractable if the parameters are chosen correctly. For example, SET
COVER is the problem whose input is a set V , a family F of its subsets, and
a number k, and the output is “yes” if there is a subset of V of size at most
k that intersects every member of F. This problem is NP-complete, but if we
choose π = k + maxF ∈F |F | to be the parameter, it becomes solvable in time
O(ππ+1
· |F |), thus becoming linear in what is likely the largest part of the
input.
We now formalize the concept of ﬁxed-parameter tractability.
Deﬁnition 6.18. Let L be a logic, and C a class of structures. The modelchecking
problem for L on C is the problem to check, for a given structure
A ∈ C and an L-sentence Φ, whether A |= Φ.
We say that the model-checking problem for L on C is ﬁxed-parameter
tractable, if there is a constant p and a function g : N → N such that for
every A ∈ C and every L-sentence Φ, checking whether A |= Φ can be done in
time
g( Φ ) · A
p
.
We say that the model-checking problem for L on C is ﬁxed-parameter
linear, if p = 1; that is, if there is a function g : N → N such that for every
A ∈ C and every L-sentence Φ, checking whether A |= Φ can be done in time
g( Φ )· A .
We now prove that on structures of bounded degree, model-checking for
FO is ﬁxed-parameter linear. The proof is based on Hanf-locality of FO.
6.6 Parametric Complexity and Locality 101
Theorem 6.19. Fix l > 0. Then the model-checking problem for FO on
STRUCTl[σ] is ﬁxed-parameter linear.
Proof. We use threshold equivalence and Theorem 4.24. Given l and Φ, we can
ﬁnd numbers d and m such that for every A, B ∈ STRUCTl[σ], it is the case
that A⇆thr
d,mB implies that A and B agree on Φ.
We know that for structures of ﬁxed degree l, the upper bound on the number
of isomorphism types of radius d neighborhoods of a point is determined
by d, l, and σ. We assume that τ1, . . . , τM enumerate isomorphism types of all
the structures of the form NA
d (a) for A ∈ STRUCTl[σ].
Let ni(A) =| {a | NA
d (a) of type τi} |. With each structure A, we now
associate an M-tuple t(A) = (t1, . . . , tM ) such that
ti =
ni(A), if ni(A) ≤ m,
∗ otherwise.
Let T be the set of all M-tuples whose elements come from {1, . . ., m} ∪ {∗}.
Note that the number of such tuples is (m + 1)M
, which depends only on l
and Φ, and that each t(A) is a member of T .
From Theorem 4.24, t(A) = t(B) implies that A and B agree on Φ. Let T0
be the set of t ∈ T such that for some structure A ∈ STRUCTl[σ], we have
A |= Φ and t(A) = t. We leave it as an exercise for the reader (see Exercise
6.7) to show that T0 is computable.
The idea of the algorithm then is to compute, for a given structure A,
the tuple t(A) in linear time. Once this is done, we check if t ∈ T0. The
computation of T0 depends entirely on Φ and l, but not on A; hence the
resulting algorithm has linear running time.
For simplicity, we present the algorithm for computing t(A) for the case
when A is an undirected graph; extension to the case of arbitrary A is straightforward.
We compute, for each node i (assuming that nodes are numbered
0, . . . , n − 1), τ(i), the isomorphism type of its d-neighborhood. For this, we
ﬁrst do a pass over the code of A, and construct an array that, for each node
i, has the list of all nodes j such that there is an edge (i, j). Note that the size
of any such list is at most l. Next, we construct the radius d neighborhood
of each node by looking up its neighbors, then the neighbors of its neighbors,
etc., in the array constructed in the ﬁrst step. After d iterations, we have
radius d neighborhood, whose size is bounded by a number that depends on
the Φ and l but not on A. Now for each i, we ﬁnd j ≤ M such that τ(i) = τj;
since the enumeration τ1, . . . , τM does not depend on A, each such step takes
constant time. Finally, we do one extra pass over (τ(i))i and compute t(A).
Hence, t(A) is computed in linear time. As we already explained, to check
if A |= Φ, we check if t ∈ T0, which takes constant time. Hence, the entire
algorithm has linear running time.
Can one prove a similar result for FO queries on arbitrary structures?
The answer is most likely no, assuming some separation results in complexity
102 6 Complexity of First-Order Logic
theory (see Exercise 6.9). In fact, these results show that even ﬁxed-parameter
tractability is very unlikely for arbitrary structures.
Nevertheless, ﬁxed-parameter tractability can be shown for some interesting
classes of structures.
Recall that a graph H is a minor of a graph G if H can be obtained from a
subgraph of G by contracting edges. A class C of graphs is called minor-closed
if for any G ∈ C and H a minor of G, we have H ∈ C.
Theorem 6.20. If C is a minor-closed class of graphs which does not include
all the graphs, then model-checking for FO on C is ﬁxed-parameter tractable.
The proof of this (hard) theorem is not given here (see Exercise 6.10).
Corollary 6.21. Model-checking for FO on the class of planar graphs is ﬁxedparameter
tractable.
6.7 Conjunctive Queries
In this section we introduce a subclass of FO queries that plays a central
role in database theory. This is the class of conjunctive queries. These are
the queries most commonly asked in relational databases; in fact any SQL
SELECT-FROM-WHERE query that only uses conjunction of attribute equalities
in the WHERE clause is such. Logically this class has a simple characterization.
Deﬁnition 6.22. A ﬁrst-order formula ϕ(x) over a relational vocabulary σ
is called a conjunctive query if it is built from atomic formulae using only
conjunction ∧ and existential quantiﬁcation ∃.
By renaming variables and pushing existential quantiﬁers outside, we can
see that every conjunctive query can be expressed as
ϕ(x) = ∃y
k
i=1
αi(x, y), (6.3)
where each αi is either of the form R(u), where R ∈ σ and u is a tuple of
variables from x, y, or u = v, where u, v are variables from x, y or constant
symbols.
We have seen an example of a conjunctive query in Chap. 1: to test if there
is a path of length k + 1 between x and x′
in a graph E, one can write
∃y1, . . . , yk R(x, y1) ∧ R(y1, y2) ∧ . . . ∧ R(yk−1, yk) ∧ R(yk, x′
).
To see how conjunctive queries can be evaluated, we introduce the concept
of a join of two relations. Suppose we have a formula ϕ(x1, . . . , xm) over
vocabulary σ. For each A ∈ STRUCT[σ], this formula deﬁnes an m-ary relation
ϕ(A) = {a | A |= ϕ(a)}. We can view ϕ(A) as an m-ary relation with
6.7 Conjunctive Queries 103
attributes x1, . . . , xm: that is, a set of ﬁnite mappings {x1, . . . , xm} → A.
Viewing ϕ(A) as a relation with columns and rows lets us name individual
columns.
Suppose now we have two relations over A: an m-ary relation S and an
l-ary relation R, such that R is viewed as a set of mappings t : X → A and S
is viewed as a set of mappings t : Y → A. Then the join of R and S is deﬁned
as
R ⋊⋉ S = {t : X ∪ Y → A | t|X∈ R, t|Y ∈ S}. (6.4)
Suppose that R is ϕ(A) where ϕ has free variables (x, z), and S is ψ(A)
where ψ has free variables (y, z). How can one construct R ⋊⋉ S? According
to (6.4), it consists of tuples (a, b, c) such that ϕ(a, c) and ψ(b, c) hold. Thus,
R ⋊⋉ S = [ϕ ∧ ψ](A).
As another operation corresponding to conjunctive queries, consider again
a relation R viewed as a set of ﬁnite mappings t : X → A, and let Y ⊆ X.
Then the projection of R on Y is deﬁned as
πY (R) = {t : Y → A | ∃t′
∈ R : t′
|Y = t}. (6.5)
Again, if R is ϕ(A), where ϕ has free variables (x, y), then πy(R) is simply
[∃x ϕ(x, y)](R).
Now suppose we have a conjunctive query
ϕ(y) ≡ ∃x α1(u1) ∧ . . . ∧ αn(un) , (6.6)
where each αi(ui) is an atomic formula S(ui) for some S ∈ σ, and ui is a list
of variables among x, y. Then for any structure A,
ϕ(A) = πy α1(A) ⋊⋉ . . . ⋊⋉ αn(A) . (6.7)
A slight extension of the correspondence between conjunctive queries and
the join and projection operations involves queries of the form
ϕ(y) ≡ ∃x α1(u1) ∧ . . . ∧ αn(un) ∧ β(x, u) , (6.8)
where β is a conjunction of formulae u1 = u2, where u1 and u2 are variables
occurring among u1, . . . , un.
Suppose we have a relation R, again viewed as a set of ﬁnite mappings
t : X → A, and a set C of conditions xi = xj, for xi, xj ∈ X. Then the
selection operation, σC(R), is deﬁned as
{t : X → A | t ∈ R, t(xi) = t(xj) for all xi = xj ∈ C}.
If R is ϕ(A), then σC(R) is simply [ϕ ∧ β](R), where β is the conjunction of
all the conditions xi = xj that occur in C.
For β being as in (6.8), let Cβ be the list of all equalities listed in β. Then,
using the selection operation, the most general form of a conjunctive query
above can be translated into
104 6 Complexity of First-Order Logic
πy σCβ
α1(A) ⋊⋉ . . . ⋊⋉ αn(A) . (6.9)
Many common database queries are of the form (6.9): they compute the
join of two or more relations, select some tuples from them, and output
only certain elements of those tuples. These can be expressed as conjunctive
queries.
The data complexity of conjunctive queries is the same as for general FO
queries: uniform AC0
. For the combined and expression complexity, we can
lower the Pspace bound of Theorem 6.16.
Theorem 6.23. The combined and expression complexity of conjunctive
queries are NP-complete (even for Boolean conjunctive queries).
Proof. It is easy to see that the combined complexity is NP: for the query
given by (6.3) and a tuple a, to check if ϕ(a) holds, one has to guess a tuple
b and then check in polynomial time if i αi(a, b) holds.
For completeness, we use reduction from 3-colorability, deﬁned
in Chap. 1 (and known to be NP-complete). Deﬁne a structure
A = {0, 1, 2}, N , where N is the binary inequality relation: N =
{(0, 1), (0, 2), (1, 0), (1, 2), (2, 0), (2, 1)}. Suppose we are given a graph with
the set of nodes U = {a1, . . . , an}, and a set of edges E ⊆ U × U. We then
deﬁne the following Boolean conjunctive query:
∃x1 . . . ∃xn
(ai,aj )∈E
N(xi, xj). (6.10)
Note that for a given graph U, E , the query Φ can be constructed in deterministic
logarithmic time.
For the query Φ given by (6.10), A |= Φ iﬀ there is an assignment of
variables xi, 1 ≤ i ≤ n, to {0, 1, 2} such that for every edge (ai, aj), the corresponding
values xi and xj are diﬀerent. That is, A |= Φ iﬀ U, E is 3-colorable,
which provides the desired reduction, and thus proves NP-completeness for
the combined (and expression, since A is ﬁxed) complexity of conjunctive
queries.
As for the data complexity of conjunctive queries, so far we have seen no
results that would distinguish it from the data complexity of FO. We shall
now see one result that lowers the complexity of conjunctive query evaluation
rather signiﬁcantly, under certain assumptions on the structure of queries.
Unlike Theorem 6.19, this result will apply to arbitrary structures.
Recall that in general, an FO sentence Φ can be evaluated on a structure
A in time O( Φ · A
k
), where k is the width of Φ. We shall now lower this
to O( Φ · A ) for the class of acyclic conjunctive queries. That is, for a
certain class of queries, we shall prove that they are ﬁxed-parameter linear on
the class of all ﬁnite structures. To deﬁne this class of queries, we need a few
preliminary deﬁnitions.
6.7 Conjunctive Queries 105
Let H be a hypergraph: that is, a set U and a set E of hyper-edges, or subsets
of U. A tree decomposition of H is a tree T together with a set Bt ⊆ U for
each node t of T such that the following two conditions hold:
1. For every a ∈ U, the set {t | a ∈ Bt} is a subtree of T .
2. Every hyper-edge of H is contained in one of the Bt’s.
A hypergraph H is acyclic if there exists a tree decomposition of H such that
each Bt, t ∈ T , is a hyper-edge of H.
Deﬁnition 6.24. Given a conjunctive query ϕ(y) ≡ ∃x α1(u1) ∧ . . . ∧
αn(un) , its hypergraph H(ϕ) is deﬁned as follows. Its set of nodes is the set
of all variables used in ϕ, and its hyper-edges are precisely u1, . . . , un.
We say that ϕ is acyclic if the hypergraph H(ϕ) is acyclic.
For example, let Φ ≡ ∃x∃y∃z R(x, y) ∧ R(y, z). Then H(Φ) is a hypergraph
on {x, y, z} with edges {(x, y), (y, z)}. A tree decomposition of H(Φ)
would have two nodes, say t1 and t2, with an edge from t1 to t2, and
Bt1 = {x, y}, Bt2 = {y, z}. Hence, Φ is acyclic.
As a diﬀerent example, let Φ′
≡ ∃x∃y∃z R(x, y) ∧ R(y, z) ∧ R(z, x). Then
H(Φ′
) is a hypergraph on {x, y, z} with edges {(x, y), (y, z), (z, x)}. Assume it
is acyclic. Then there is some tree decomposition of H(Φ′
) in which the sets
Bt include {x, y}, {y, z}, {x, z}. By a straightforward inspection, there is no
way to assign these sets to nodes of a tree so that condition 1 of the deﬁnition
of tree decomposition would hold. Hence, Φ′
is not acyclic.
In general, for binary relations, hypergraph and graph acyclicity coincide.
To give an example involving hyper-edges, consider a query
Ψ ≡ ∃x∃y∃z∃u∃v R(x, y, z) ∧ R(z, u, v) ∧ S(u, z) ∧ S(x, y) ∧ S(v, w) .
Its hypergraph has hyper-edges {x, y, z}, {z, u, v}, {u, z}, {x, y}, {v, w}. The
maximal edges of this hypergraph are shown in Fig. 6.2 (a). This hypergraph
is acyclic. Indeed, consider a tree with three nodes, t1, t2, t3, and edges (t1, t2)
and (t1, t3). Deﬁne Bt1 as {z, u, v}, Bt2 as {x, y, z}, and Bt3 as {v, w} (see
Fig. 6.2 (b)). This deﬁnes an acyclic tree decomposition of H(Ψ).
If, on the other hand, we consider a query
Ψ′
≡ ∃x∃y∃z∃u∃v R(x, y, z) ∧ R(z, u, v) ∧ R(x, v, w)
then one can easily check that H(Φ′
) (shown in Fig. 6.2 (c)) is not acyclic.
We now show that acyclic conjunctive queries are ﬁxed-parameter
tractable (in fact, ﬁxed-parameter linear) over arbitrary structures. The result
below is given for Boolean conjunctive queries; for extension to queries with
free variables, see Exercise 6.13.
106 6 Complexity of First-Order Logic
x y z
u
v w
t1
t2 t3
{z, u, v}
{x, y, z} {v, w}
x y z
u
v
w
(a) (b) (c)
Fig. 6.2. Cyclic and acyclic hypergraphs
Theorem 6.25. Let Φ be a Boolean acyclic conjunctive query over σstructures,
and let A ∈ STRUCT[σ]. Then checking whether A |= Φ can be
done in time O( Φ · A ).
Proof. Let Φ be
Φ ≡ ∃x1 . . . xm
n
i=1
αi(ui),
where each αi(ui) is of the form S(ui) for S ∈ σ, and ui contains some
variables from x. The case when some of the αi’s are variable equalities can
be shown by essentially the same argument, by adding one selection over the
join of all αi(A)’s.
We use a known result that if H is acyclic, then its tree decomposition
satisfying the condition that each Bt is a hyper-edge of H can be computed in
linear time. Furthermore, one can construct this decomposition so that Bt1 ⊆
Bt2 for any t1 = t2. Hence, we assume that we have such a decomposition
(T , (Bt)t∈T ) for H(Φ), computed in time O( Φ ). Let ≺ denote the partial
order of T , with the root being the smallest node.
From the acyclicity of H, it follows that there is a bijection between maximal,
with respect to ⊆, sets ui, and nodes t of T . For each i, let νi be the
node t such that ui is contained in Bt. This node is unique: we look for the
maximal uj that contains ui, and ﬁnd the unique node t such that Bt = uj.
We now deﬁne
Rt = ⋊⋉
i∈[1,n]
νi=t
αi(A). (6.11)
Our goal is now to compute the join of all Rt’s, since (6.7) implies that
A |= Φ ⇔ ⋊⋉
t∈T
Rt = ∅. (6.12)
To show that (6.11) and (6.12) yield a linear time algorithm, we
need two complexity bounds on computing projections and joins: πX(R)
can be computed in time O( R ), and R ⋊⋉ S can be computed in time
O( R + S + R ⋊⋉ S ) (see Exercise 6.12).
6.7 Conjunctive Queries 107
To see that each Rt can be computed in linear time, let it be such that
uit = Bt (it exists since the query is acyclic). Then
Rt = αit (uit ) ⋊⋉ αi1 (ui1 ) ⋊⋉ . . . ⋊⋉ αik
(uik
),
where all uij ⊆ uit , j ≤ k. Hence Rt ⊆ αit (A). Using the above bounds for
computing joins and projections, we conclude that the entire family Rt, t ∈ T ,
can be computed in time O( Φ · A ).
We deﬁne
Pt = ⋊⋉
v t
Rv,
where is the partial order of T , with the root r being the smallest element.
If t is a leaf of T , then Pt = Rt. Otherwise, let t be a node with children
t1, . . . , tl. Then
Pt = Rt ⋊⋉ ⋊⋉
1≤i≤l
⋊⋉
v ti
Rv = Rt ⋊⋉ ⋊⋉
1≤i≤l
Pti . (6.13)
Using (6.13) inductively, we compute Pr = ⋊⋉tRt in time
O( T · maxt Rt ). We saw that Rt ≤ A for each t, and, furthermore,
T can be computed from Φ in linear time. Hence, Pr can be found in
time O( Φ · A ), which together with (6.12) implies that A |= Φ can be
tested with the same bounds. This completes the proof.
There is another interesting way to connect tree decompositions with
tractability of conjunctive queries. Suppose we have a conjunctive query ϕ(x)
given by (6.3). We deﬁne its graph G(ϕ), whose set of vertices is the set of
variables used in ϕ, with an edge between two variables u and v if there
is an atom αi such that both u and v are its free variables. For example,
if ϕ(x, y) ≡ ∃z∃v R(x, y, z) ∧ S(z, v), then G(ϕ) has undirected edges
(x, y), (x, z), (y, z), and (z, v).
A tree decomposition of G(ϕ) is a tree decomposition, as deﬁned earlier,
when we view G(ϕ) as a hypergraph. In other words, it consists of a tree T ,
and a set Bt of nodes of G(ϕ) for each t ∈ T , such that
1. {t | v ∈ Bt} forms a subtree of T for each v, and
2. for every edge (u, v), both u and v are in one of the Bt’s.
The width of a tree decomposition is maxt |Bt | −1. The treewidth of G(ϕ)
is the minimum width of a tree decomposition of G(ϕ). It is easy to see that
the treewidth of a tree is 1.
For k > 0, let CQk be the class of conjunctive queries ϕ such that the
treewidth of G(ϕ) is at most k. Then the following can be shown.
Theorem 6.26. Let k > 0 be ﬁxed, and let ϕ be a query from CQk.
Then, for every structure A, one can compute ϕ(A) in polynomial time in
Φ + A + ϕ(A) . In particular, Boolean queries from CQk can be
evaluated in polynomial time in Φ + A .
108 6 Complexity of First-Order Logic
In other words, conjunctive-query evaluation becomes tractable for queries
whose graphs have bounded treewidth. Exercise 6.15 shows that the converse
holds, under certain complexity-theoretic assumptions.
6.8 Bibliographic Notes
The notions of data, expression, and combined complexity are due to Vardi
[244], see also [3].
Representation of ﬁrst-order formulae by Boolean circuits is fairly standard,
see, e.g., books [133] and [247]. Proposition 6.6 was explicitly shown by
Vardi [245].
Theorem 6.7 is perhaps the deepest result in circuit complexity. It was
proved by Furst, Saxe, and Sipser [86] (see also Ajtai [10] and Denenberg,
Gurevich, and Shelah [55]).
The notion of uniformity and its connection with logical descriptions of
complexity classes was studied by Barrington, Immerman, and Straubing [16].
Proofs of FO(<, BIT) = FO(+++,×××) are given in [133] and – partially – in
[247]. The proof of expressibility of BIT (Theorem 6.12) follows closely the
presentation in Buss [29] and Cook [40].
Pspace-completeness of FO (expression complexity) and of QBF is due
to Stockmeyer [222].
The idea of using parameterized complexity as a reﬁnement of the notions
of the data and expression complexity was proposed by Yannakakis [250], and
developed by Papadimitriou and Yannakakis [196]. Parameterized complexity
is treated in a book by Downey and Fellows [58]; see also surveys by Grohe
[109, 111]. Theorem 6.19 is from Seese [219], Theorem 6.20 is from Flum and
Grohe [81].
The notion of conjunctive queries is a fundamental one in database theory,
see [3]. NP-completeness of conjunctive queries (combined complexity) is due
to Chandra and Merlin [34]. Fixed-parameter linearity of acyclic conjunctive
queries is due to Yannakakis [249]; the presentation here follows closely Flum,
Frick, and Grohe [80]. A linear time algorithm for producing tree decompositions
of hypergraphs, used in Theorem 6.25, is due to Tarjan and Yannakakis
[228]. Flum, Frick, and Grohe [80] show how to extend the notion of acyclicity
to FO formulae. Theorem 6.26 and Exercise 6.15 are from Grohe, Schwentick,
and Segouﬁn [114]. See also Gottlob, Leone, and Scarcello [96] for additional
results on the complexity of acyclic conjunctive queries.
6.9 Exercises 109
Sources for exercises:
Exercise 6.4: Dawar et al. [50]
Exercise 6.5: Immerman [133], Vollmer [247]
Exercise 6.9: Papadimitriou and Yannakakis [196]
Exercise 6.12: Flum, Frick, and Grohe [80]
Exercises 6.13 and 6.14: Flum, Frick, and Grohe [80]
Yannakakis [249]
Exercise 6.15: Grohe, Schwentick, and Segouﬁn [114]
Exercise 6.16: Flum and Grohe [81]
Exercise 6.18: Gottlob, Leone, and Scarcello [96]
Exercise 6.19: Chandra and Merlin [34]
6.9 Exercises
Exercise 6.1. Show that none of the following is expressible in FO(All): transitive
closure of a graph, testing for planarity, acyclicity, 3-colorability.
Exercise 6.2. Prove that ⌊
√
x⌋ is expressible in FO(+++,×××).
Exercise 6.3. Consider two countable undirected graphs. For the ﬁrst one, the universe
is N, and we have an edge between i and j iﬀ BIT(i, j) or BIT(j, i) is true. In
the other graph, the universe is N+ = {n ∈ N | n > 0} and there is an edge between
n and m, for n > m, iﬀ n is divisible by pm, the mth prime. Prove that these graphs
are isomorphic.
Hint: if you ﬁnd it hard to do all the calculations required for the proof, you may
want to wait until Chap. 12, which introduces some powerful logical tools that let
you prove results of this kind without using any number theory at all (see Exercise
12.9, part a).
Exercise 6.4. Show that the standard linear order is expressible in FO(BIT). Conclude
that FO(+++,×××) = FO(BIT).
Exercise 6.5. Prove Lemma 6.14.
You may ﬁnd it useful to show that the following predicate is expressible in
FO(+++,×××): BitSum(x, y) iﬀ the number of ones in the binary representation of x is
y.
Exercise 6.6. Prove that QBF is Pspace-complete.
Exercise 6.7. We stated in the proof of Theorem 6.19 that the set of tuples t ∈ T
for which there exists a structure A with t(A) = t and A |= Φ is computable. Prove
this statement, using the assumption that A is of bounded degree. Derive bounds
on the constant in the O( A ) running time.
Exercise 6.8. Give an example of a two-element structure over which the expression
complexity of conjunctive queries is NP-hard. Recall that in the proof of Theorem
6.23, we used a structure whose universe had three elements.
110 6 Complexity of First-Order Logic
Exercise 6.9. In this exercise, we refer to parameterized complexity class W [1]
whose deﬁnition can be found in [58, 81]. This class is believed to contain problems
which are not ﬁxed-parameter tractable.
Prove that checking A |= Φ, with Φ being the parameter, is W [1]-hard, even if
Φ is a conjunctive query. Thus, it is unlikely that FO (or even conjunctive queries)
are ﬁxed-parameter tractable.
Exercise 6.10. Derive Theorem 6.20 from the following facts. H is an excluded
minor of a class of graphs C if no G ∈ C has H as a minor. If such an H exists, then
C is called a class of graphs with an excluded minor.
• If C is a minor-closed class of graphs, membership in C can be veriﬁed in Ptime
(see Robertson and Seymour [205]).
• If C is a Ptime-decidable class of graphs with an excluded minor, then checking
Boolean FO queries on C is ﬁxed-parameter tractable (see Flum and Grohe [81]).
Exercise 6.11. Prove that an order-invariant conjunctive query is FO-deﬁnable
without the order relation. That is, (CQ+ <)inv ⊆ FO.
Exercise 6.12. Prove that R ⋊⋉ S can be evaluated in O( R + S + R ⋊⋉ S ).
Exercise 6.13. Extend the proof of Theorem 6.25 to deal conjunctive queries with
free variables, by showing that ϕ(A), for an acyclic ϕ, can be computed in time
O( ϕ · A · ϕ(A) ). Also show that if the set of free variables of ϕ is contained
in one of the Bt’s, for a tree decomposition of H(ϕ), then the evaluation can be done
in time O( ϕ · A ).
Exercise 6.14. Extend Theorem 6.25 and Exercise 6.13 to conjunctive queries with
negation; that is, conjunctive queries in which some atoms are of the form x = y,
where x and y are variables.
Exercise 6.15. Under the complexity-theoretic assumption that W [1] contains
problems which are not ﬁxed-parameter tractable (see Exercise 6.9), the converse
to Theorem 6.26 holds: if for a class of graphs C, it is the case that every
conjunctive query ϕ with G(ϕ) ∈ C can be evaluated in time polynomial in
Φ + A + ϕ(A) , then C has bounded treewidth (i.e., there is a constant
k > 0 such that every graph in C has treewidth at most k).
Exercise 6.16. We say that a class of structures C ⊆ STRUCT[σ] has bounded
treewidth if there is k > 0 such that for every A ∈ C, the treewidth of its Gaifman
graph is at most k. Prove that FO is ﬁxed-parameter tractable on classes of
structures of bounded treewidth.
Exercise 6.17. Give an example of a conjunctive query which is of treewidth 2 but
not acyclic. Also, give an example of a family of acyclic conjunctive queries that has
queries of arbitrarily large treewidth.
Exercise 6.18. Given a hypergraph H, its hypertree decomposition is a triple
(T, (Bt)t∈T , (Ct)t∈T ) such that (T, (Bt)t∈T ) is a tree decomposition of H, and each
Ct is a set of hyper-edges. It is required to satisfy the following two properties for
every t ∈ T:
6.9 Exercises 111
1. Bt ⊆
S
Ct;
2.
S
Ct ∩
S
v t
Bv ⊆ Bt.
The hypertree width of H is deﬁned as the minimum value of maxt∈T | Ct |, taken
over all hypertree decompositions of H.
Prove the following:
(a) A hypergraph is acyclic iﬀ its hypertree width is 1.
(b) For each ﬁxed k, conjunctive queries whose hypergraphs have hypertree width
at most k can be evaluated in polynomial time.
Note that this does not contradict the result of Exercise 6.15 which refers to
graph-based (as opposed to hypergraph-based) classes of conjunctive queries.
Exercise 6.19. Suppose ϕ1(x) and ϕ2(x) are two conjunctive queries. We write
ϕ1 ⊆ ϕ2, if ϕ1(A) ⊆ ϕ2(A) for all A (in other words, ∀x ϕ1(x) → ϕ2(x) is valid in
all ﬁnite structures). We write ϕ1 = ϕ2 if both ϕ1 ⊆ ϕ2 and ϕ2 ⊆ ϕ1 hold.
Prove that testing both ϕ1 ⊆ ϕ2 and ϕ1 = ϕ2 is NP-complete.
Exercise 6.20.∗
Use Ehrenfeucht-Fra¨ıss´e games to prove that parity is not expressible
in FO(+++,×××).
7
Monadic Second-Order Logic and Automata
We now move to extensions of ﬁrst-order logic. In this chapter we introduce
second-order logic, and consider its often used fragment, monadic secondorder
logic, or MSO, in which one can quantify over subsets of the universe.
We study the expressive power of this logic over graphs, proving that its
existential fragment expresses some NP-complete problems, but at the same
time cannot express graph connectivity. Then we restrict our attention to
strings and trees, and show that, over them, MSO captures regular string and
tree languages. We explore the connection with automata to prove further
deﬁnability and complexity results.
7.1 Second-Order Logic and Its Fragments
We have seen a few examples of second-order formulae in Chap. 1. The idea is
that in addition to quantiﬁcation over the elements of the universe, we can also
quantify over subsets of the universe, as well as binary, ternary, etc., relations
on it. For example, to express the query even, we can say that there are two
disjoint subsets U1 and U2 of the universe A such that A = U1 ∪ U2 and there
is a one-to-one mapping F : U1 → U2. This is expressed by a formula
∃U1 ∃U2 ∃F ϕ,
where ϕ is an FO formula in the vocabulary (U1, U2, F) stating that U1 and U2
form a partition of the universe (∀x (U1(x) ↔ ¬U2(x))), and that F ⊆ U1 ×U2
is functional, onto, and one-to-one.
Note that the formula ϕ in this example has three second-order free variables
U1, U2, and F. We now formally deﬁne second-order logic.
Deﬁnition 7.1 (Second-order logic). The deﬁnition of second-order logic,
SO, extends the deﬁnition of FO with second-order variables, ranging over subsets
and relations on the universe, and quantiﬁcation over such variables. We
114 7 Monadic Second-Order Logic and Automata
assume that for every k > 0, there are inﬁnitely many variables Xk
1 , Xk
2 , . . .,
ranging over k-ary relations. A formula of SO can have both ﬁrst-order and
second-order free variables; we write ϕ(x, X) to indicate that x are free ﬁrstorder
variables, and X are free second-order variables.
Given a vocabulary σ that consists of relation and constant symbols, we
deﬁne SO terms and formulae, and their free variables, as follows:
• Every ﬁrst-order variable x, and every constant symbol c, are ﬁrst-order
terms. The only free variable of a term x is the variable x, and c has no
free variables.
• There are three kinds of atomic formulae:
– FO atomic formulae; that is, formulae of the form
– t = t′
, where t, t′
are terms, and
– R(t ), where t is a tuple of terms, and R ∈ σ, and
– X(t1, . . . , tk), where t1, . . . , tk are terms, and X is a second-order variable
of arity k. The free ﬁrst-order variables of this formula are free
ﬁrst-order variables of t1, . . . , tk; the free second-order variable is X.
• The formulae of SO are closed under the Boolean connectives ∨, ∧, ¬, and
ﬁrst-order quantiﬁcation, with the usual rules for free variables.
• If ϕ(x, Y, X) is a formula, then ∃Y ϕ(x, Y, X) and ∀Y ϕ(x, Y, X) are
formulae, whose free variables are x and X.
The semantics is deﬁned as follows. Suppose A ∈ STRUCT[σ]. For each
formula ϕ(x, X), we deﬁne the notion A |= ϕ(b, B), where b is a tuple of
elements of A of the same length as x, and for X = (X1, . . . , Xl), with each
Xi being of arity ni, B = (B1, . . . , Bl), where each Bi is a subset of Ani
.
We give the semantics only for constructors that are diﬀerent from those
for FO:
• If ϕ(x, X) is X(t1, . . . , tk), where X is k-ary and t1, . . . , tk are terms, with
free variables among x, then A |= ϕ(b, B) iﬀ the tuple (tA
1 (b), . . . , tA
k (b)) is
in B.
• If ϕ(x, X) is ∃Y ψ(x, Y, X), where Y is k-ary, then A |= ϕ(b, B) if for
some C ⊆ Ak
, it is the case that A |= ψ(b, C, B).
• If ϕ(x, X) is ∀Y ψ(x, Y, X), and Y is k-ary, then A |= ϕ(b, B) if for all
C ⊆ Ak
, we have A |= ψ(b, C, B).
We know that every FO formula can be written in the prenex normal form
Q1x1 . . . Qnxn ψ, where Qi’s are ∃ or ∀, and ψ is quantiﬁer-free. Likewise,
every SO formula can be written as a sequence of ﬁrst- and second-order
quantiﬁers, followed by a quantiﬁer-free formula. Furthermore, note the following
equivalences:
7.1 Second-Order Logic and Its Fragments 115
∃x Q ϕ(x, ·) ↔ ∃X Q ∃x (X(x) ∧ ϕ(x, ·)) (7.1)
∀x Q ϕ(x, ·) ↔ ∀X Q ∃!x X(x) → ∀x (X(x) → ϕ(x, ·)) , (7.2)
where Q stands for an arbitrary sequence of ﬁrst- and second-order quantiﬁers.
Using those inductively, we can see that every SO formula is equivalent to a
formula in the form
Q1X1 . . . QnXnQ1x1 . . . Qlxl ψ, (7.3)
where QiXi are second-order quantiﬁers, Qjxj are ﬁrst-order quantiﬁers, and
ψ is quantiﬁer-free.
We now deﬁne some restrictions of the full SO logic of interest to us. The
ﬁrst one is the central notion studied in this chapter.
Deﬁnition 7.2. Monadic SO logic, or MSO, is deﬁned as the restriction of
SO where all second-order variables have arity 1.
In other words, in MSO, second-order variables range over subsets of the
universe.
Rules (7.1) and (7.2) do not take us out of MSO, and hence every MSO
formula is equivalent to one in the normal form (7.3), where the second-order
quantiﬁers precede the ﬁrst-order quantiﬁers.
Deﬁnition 7.3. Existential SO logic, or ∃SO, is deﬁned as the restriction of
SO that consists of the formulae of the form
∃X1 . . . ∃Xn ϕ,
where ϕ does not have any second-order quantiﬁcation. If, furthermore, all
Xi’s have arity 1, the resulting restriction is called existential monadic SO,
or ∃MSO.
If the second-order quantiﬁer preﬁx consists only of universal quantiﬁers,
we speak of the universal SO logic, or ∀SO, and its further restriction to
monadic quantiﬁers is referred to as ∀MSO.
In other words, an ∃SO formula starts with a second-order existential preﬁx
∃X1 . . . ∃Xn, and what follows is an FO formula ϕ (in the original vocabulary
expanded with X1, . . . , Xn).
Formula (1.2) from Chap. 1 stating the 3-colorability of a graph is an
example of an ∃MSO formula, while (1.3) stating the existence of a clique of
a given size is an example of an ∃SO formula.
Deﬁnition 7.4. The quantiﬁer rank of an SO formula is deﬁned as the maximum
depth of quantiﬁer-nesting, including both ﬁrst-order and second-order
quantiﬁers. That is, the rules for the quantiﬁer rank for FO are augmented
with
• qr(∃X ϕ) = qr(∀X ϕ) = qr(ϕ) + 1.
116 7 Monadic Second-Order Logic and Automata
7.2 MSO Games and Types
MSO can be characterized by a type of Ehrenfeucht-Fra¨ıss´e game, which is
fairly close to the game we have used for FO. As in the case of FO, the game
is also closely connected with the notion of type.
Let MSO[k] consist of all MSO formulae of quantiﬁer-rank at most k.
An MSO rank-k m, l-type is a consistent set S of MSO[k] formulae with m
free ﬁrst-order variables and l free second-order variables such that for every
ϕ(x1, . . . , xm, X1, . . . , Xl) from MSO[k], either ϕ ∈ S or ¬ϕ ∈ S.
Given a structure A, an m-tuple a ∈ A, and an l-tuple V of subsets of A,
the MSO rank-k type of (a, U) in A is the set
mso-tpk(A, a, V ) = {ϕ(x, X) ∈ MSO[k] | A |= ϕ(a, V )}.
Clearly, mso-tpk(A, a, V ) is an MSO rank-k type.
When both a and V are empty, mso-tpk(A) is the set of all MSO[k] sentences
that are true in A.
Just as for FO, a simple inductive argument shows that for each m and
l, up to logical equivalence, there are only ﬁnitely many diﬀerent formulae
ϕ(x1, . . . , xm, X1, . . . , Xl) in MSO[k]. Hence, MSO rank-k m, l types (where
m and l stand for the number of free ﬁrst-order and second-order variables,
respectively) are essentially ﬁnite objects. In fact, just as for FO, one can show
the following result for MSO.
Proposition 7.5. Fix k, l, m.
• There exist only ﬁnitely many MSO rank-k m, l types.
• Let T1, . . . , Ts enumerate all the MSO rank-k m, l types. There exist
MSO[k] formulae αi(x, X), i = 1, . . . , s, such that for every structure
A, every m-tuple a of elements of A, and every l-tuple U of subsets of A,
it is the case that A |= αi(a, U) iﬀ mso-tpk(A, a, U) = Ti.
Furthermore, each MSO[k] formula with m free ﬁrst-order variables
and l free second-order variables is equivalent to a disjunction of some of
the αi’s.
Hence, just as in the case of FO, we shall associate rank-k types with their
deﬁning formulae, which are also of quantiﬁer rank k.
We now present the modiﬁcation of Ehrenfeucht-Fra¨ıss´e games for MSO.
Deﬁnition 7.6. An MSO game is played by two players, the spoiler and the
duplicator, on two structures A and B of the same vocabulary σ. The game
has two diﬀerent kinds of moves:
7.2 MSO Games and Types 117
Point move This is the same move as in the Ehrenfeucht-Fra¨ıss´e game for FO:
the spoiler chooses a structure, A or B, and an element of that structure;
the duplicator responds with an element in the other structure.
Set move The spoiler chooses a structure, A or B, and a subset of that structure.
The duplicator responds with a subset of the other structure.
Let a1, . . . , ap ∈ A and b1, . . . , bp ∈ B be the point moves played in the
k-round game, with V1, . . . , Vs ⊆ A and U1, . . . , Us ⊆ B being the set moves
(i.e., p+s = k, and the moves of the same round have the same index). Then
the duplicator wins the game if (a, b) is a partial isomorphism of (A, V ) and
(B, U). If the duplicator has a winning strategy in the k-round MSO game on
A and B, we write A ≡MSO
k B.
Furthermore, we write (A, a0, V0) ≡MSO
k (B, b0, U0) if the duplicator has a
winning strategy in the k-round MSO game on A and B starting with position
((a0, V0), (b0, U0)). That is, when k rounds of the game a, b, V , U are played,
(a0a, b0b) is a partial isomorphism between (A, V0, V ) and (B, U0, U).
This game captures the expressibility in MSO[k].
Theorem 7.7. Given two structures A and B, two m-tuples a0, b0 of elements
of A and B, and two l-tuples V0, U0 of subsets of A and B, we have
mso-tpk(A, a0, V0) = mso-tpk(B, b0, U0) ⇔ (A, a0, V0) ≡MSO
k (B, b0, U0).
That is, (A, a0, V0) ≡MSO
k (B, b0, U0) iﬀ for every MSO[k] formula ϕ(x, X),
A |= ϕ(a0, V0) ⇔ B |= ϕ(b0, U0).
The proof is essentially the same as the proof of Theorem 3.9, and is left
to the reader as an exercise (see Exercise 7.1).
In the case of sentences, Theorem 7.7 gives us the following.
Corollary 7.8. If A and B are two structures of the same vocabulary, then
A ≡MSO
k B iﬀ A and B agree on all the sentences of MSO[k].
As for FO, the method of games is complete for expressibility in MSO.
Proposition 7.9. A property P of σ-structures is expressible in MSO iﬀ there
is a number k such that for every two σ-structures A, B, if A has the property
P and B does not, then the spoiler wins the k-round MSO game on A and B.
Proof. Assume P is expressible by a sentence Φ of quantiﬁer rank k. Let
α1, . . . , αs enumerate all the MSO rank-k types (without free variables). Then
P is equivalent to a disjunction of some of the αi’s. Hence, if A has P and B
does not, there is some i such that A |= αi and B |= ¬αi, and thus A ≡MSO
k B.
118 7 Monadic Second-Order Logic and Automata
Conversely, suppose that we can ﬁnd k ≥ 0 such that for every A having
P and B not having P, we have A ≡MSO
k B. Now take any two structures
A1 and A2 such that A1 ≡MSO
k A2. Suppose A1 has P. If A2 does not have
P, we would conclude A1 ≡MSO
k A2, which contradicts the assumption; hence
A2 has P as well. Thus, P is a union of rank-k MSO types. Since there are
ﬁnitely many of them, and each is deﬁnable by a rank-k MSO sentence, we
conclude that P is MSO[k]-deﬁnable.
Most commonly, we use the contrapositive of this proposition, which tells
us when some property is not expressible in MSO.
Corollary 7.10. A property P of σ-structures is not expressible in MSO iﬀ
for every k ≥ 0, one can ﬁnd Ak, Bk ∈ STRUCT[σ] such that:
• Ak has the property P,
• Bk does not have the property P, and
• Ak ≡MSO
k Bk.
Our next goal it to use games to study expressibility in MSO. A useful
technique is the composition of MSO games, which allows us to construct more
complex games from simpler ones. Similarly to Exercise 3.15, we can show the
following.
Lemma 7.11. Let A1, A2, B1, B2 be σ-structures, and let A be the disjoint
union of A1 and A2, and B the disjoint union of B1 and B2. Assume
A1 ≡MSO
k B1 and A2 ≡MSO
k B2. Then A ≡MSO
k B.
Proof sketch. Assume the spoiler makes a point move, say a in A. Then a is
in A1 or A2. Suppose a is in A1; then the duplicator selects a response b in
B1 according to his winning strategy on A1 and B1.
Assume the spoiler makes a set move, say U ⊆ A. The universe A is the
disjoint union of A1 and A2, the universes of A1 and A2. Let Ui = U ∩ Ai, i =
1, 2. Let Vi be the response of the duplicator to Ui in Bi, i = 1, 2, according
to the winning strategy. Then the response to U is V = V1 ∪ V2. It is routine
to verify that, using this strategy, the duplicator wins in k rounds.
As an application of the composition argument, we prove the following.
Proposition 7.12. Let σ = ∅. Then even is not MSO-expressible.
Proof. We claim that for every A and B with |A|, |B| ≥ 2k
, it is the case that
A ≡MSO
k B. Clearly this implies that even is not MSO-deﬁnable. Since σ = ∅,
we shall write U ≡MSO
k V instead of the more formal (U, ∅) ≡MSO
k (V, ∅).
We prove the statement by induction on k. The cases of k = 0 and k = 1
are easy, so we show how to go from k to k + 1.
Suppose A and B with | A |, |B| ≥ 2k+1
are given. We only consider a
set move by the spoiler, since any point move a can be identiﬁed with the
set move {a}. Assume that in the ﬁrst move, the spoiler plays U ⊆ A. We
distinguish the following cases:
7.3 Existential and Universal MSO on Graphs 119
1. |U| ≤ 2k
. Then pick an arbitrary set V ⊆ B such that |V |=|U |. We have
U ∼= V (and thus U ≡MSO
k V ), and A − U ≡MSO
k B − V – the latter is by
the induction hypothesis, since |A − U|, |B − V | ≥ 2k
. Combining the two
games, we see that from the position (U, V ) on A and B, the duplicator
can continue the game for k rounds, and hence A ≡MSO
k+1 B.
2. |A−U| ≤ 2k
. This case is treated in exactly the same way as the previous
one.
3. |U| > 2k
and |A − U| > 2k
. Since |B| ≥ 2k+1
, we can ﬁnd a subset
V ⊆ B such that both |V | and |B − V | are at least 2k
. By the induction
hypothesis, we know that U ≡MSO
k V and A − U ≡MSO
k B − V , and hence
from (U, V ), the duplicator can play for k more rounds, thus proving
A ≡MSO
k+1 B.
Suppose now that the vocabulary is expanded by one binary symbol <
interpreted as a linear ordering; that is, we deal with ﬁnite linear orders. Then
even is expressible in MSO. To see this, we let our MSO sentence guess the
set that consists of alternating elements a1, a3, . . . , a2n+1, . . . in the ordering
a1 < a2 < a3 < . . ., such that the ﬁrst element is in this set, and the last
element is not:
∃X


∀x (ﬁrst(x) → X(x))
∧ ∀x (last(x) → ¬X(x))
∧ ∀x∀y succ<(x, y) → (X(x) ↔ ¬X(y))

 ,
where ﬁrst(x) stands for ∀y (y ≥ x), last(x) stands for ∀y (y ≤ x), and
succ<(x, y) stands for (x < y) ∧ ¬∃z (x < z ∧ z < y).
Thus, as for FO, we have a separation between the ordered and unordered
case. Noticing that even is an order-invariant query, we obtain the following.
Corollary 7.13. MSO (MSO+ <)inv.
Note the close connection between Corollary 7.13 and Theorem 5.3: the
latter showed that FO (FO+ <)inv, and the separating example was the
parity of the number of atoms of a Boolean algebra. We used the Boolean
algebra to simulate monadic second-order quantiﬁcation; in MSO it comes for
free, and hence even worked as a separating query.
7.3 Existential and Universal MSO on Graphs
In this section we study two restrictions of MSO: existential MSO, or ∃MSO,
and universal MSO, or ∀MSO, whose formulae are respectively of the form
∃X1 . . . ∃Xn ϕ
and
120 7 Monadic Second-Order Logic and Automata
∀X1 . . . ∀Xn ϕ,
where ϕ is ﬁrst-order.
These also are commonly found in the literature under the names monadic
Σ1
1 for ∃MSO and monadic Π1
1 for ∀MSO, where monadic, of course, refers to
second-order quantiﬁcation over sets. In general, Σ1
k consists of formulae whose
preﬁx of second-order quantiﬁers consists of k blocks, with the ﬁrst block
being existential. For example, a formula ∃X1∃X2∀Y1∃Z1ψ is a Σ1
3 -formula.
The class Π1
k is deﬁned likewise, except that the ﬁrst block of quantiﬁers is
universal.
Another name for ∃MSO is monadic NP, and ∀MSO is referred to as
monadic coNP. The reason for these names will become clear in Chap. 9,
when we prove Fagin’s theorem.
We now give an example of a familiar property that separates monadic Π1
1
from monadic Σ1
1 (i.e., ∀MSO from ∃MSO).
Proposition 7.14. Graph connectivity is expressible in ∀MSO, but is not expressible
in ∃MSO.
Proof. A graph is not connected if its nodes can be partitioned into two
nonempty sets with no edges between them:
∃X
∃x X(x) ∧ ∃x ¬X(x)
∧ ∀x∀y (X(x) ∧ ¬X(y) → ¬E(x, y))
(7.4)
Since (7.4) is an ∃MSO sentence, its negation, expressing graph connectivity,
is a universal MSO sentence.
For the converse, we use Hanf-locality. Suppose that connectivity is deﬁnable
by an ∃MSO sentence Φ ≡ ∃X1 . . . ∃Xmϕ. Assume without loss of
generality that m > 0. Since ϕ is a ﬁrst-order sentence (over structures of
vocabulary σ extended with X1, . . . , Xn), it is Hanf-local. Let d = hlr(ϕ), the
Hanf-locality rank of ϕ. That is, if (G, U1, . . . , Um)⇆d(G′
, V1, . . . , Vm), where
G, G′
are graphs and the Ui’s and the Vi’s interpret Xi’s over them, then
(G, U1, . . . , Um) and (G′
, V1, . . . , Vm) agree on ϕ.
We now set K = 2m(2d+1)
and r = (4d+4)K. We claim the following: if G
is an m-colored graph (i.e., a graph on which m unary predicates are deﬁned),
which is a cycle of length at least r, then there exist two nodes a and b such
that the distance between them is at least 2d + 2, and their d-neighborhoods
are isomorphic.
Indeed, for a long enough cycle, the d-neighborhood of each node a is a
chain of length 2d + 1 with a being the middle node. Each node on the chain
can belong to some of the Ui’s, and there are 2m
possibilities for choosing
a subset of indexes 1, . . . , m of Ui’s such that a ∈ Ui. Hence, there are at
most K diﬀerent isomorphism types of d-neighborhoods. If the length of the
cycle is at least (4d + 4)K, then there is one type of d-neighborhoods which
7.3 Existential and Universal MSO on Graphs 121
.
.
.
..
.
. .
.
.
..
a a′
bb′
Fig. 7.1. Illustration for the proof of Proposition 7.14
is realized by at least 4d + 4 elements, and hence two of those elements will
be at distance at least 2d + 2 from each other.
Now let G be a cycle of length at least r. Since G is a connected graph,
we have G |= Φ. Let U1, . . . , Um witness it; that is, (G, U1, . . . , Um) |= ϕ.
Choose a, b such that a ≈
(G,U1,...,Um)
d b and d(a, b) > 2d + 1, and let a′
, b′
be
their successors (in an arbitrarily chosen orientation of G; the one shown in
Fig. 7.1 is the clockwise orientation).
We now construct a new graph G′
by removing edges (a, a′
) and (b, b′
)
from G, and adding edges (a, b′
) and (b, a′
). We claim that for every node c,
N
(G,U1,...,Um)
d (c) ∼= N
(G′
,U1,...,Um)
d (c). (7.5)
First, since a and b are at the distance at least 2d + 2, the d-neighborhood
of any point in G or G′
is a chain of length 2d + 1. If c is at the distance d or
greater from a and b, its d-neighborhood is the same in (G, U1, . . . , Um) and
(G′
, U1, . . . , Um), which means that (7.5) holds.
Suppose now that the distance between c and a is d0 < d, and assume
that c precedes a in the clockwise orientation of G. Then the d predecessors
of c are the same in both structures. Furthermore, since a ≈
(G,U1,...,Um)
d b, in
both structures the d − d0 successors of a agree on all the Ui’s. Hence, (7.5)
holds for c. The remaining cases (again, viewing G in the clockwise order) are
those of c preceding b, or following a or a′
and being at the distance less than
d from them. In all of those cases the same argument as above proves (7.5).
We have thus established a bijection f between the universes of
(G, U1, . . . , Um) and (G′
, U1, . . . , Um) (which is in fact the identity) that wit-
nesses
(G, U1, . . . , Um) ⇆d (G′
, U1, . . . , Um).
Since d = hlr(ϕ), we conclude that (G′
, U1, . . . , Um) |= ϕ, and hence G′
|=
∃X1 . . . ∃Xm ϕ; that is, G′
|= Φ. But G′
is not a connected graph, which
contradicts our assumption that Φ is an ∃MSO sentence deﬁning graph con-
nectivity.
122 7 Monadic Second-Order Logic and Automata
Notice that the formula (7.4) from the proof of Proposition 7.14 shows
that the negation of graph connectivity is ∃MSO-expressible, which means
that ∃MSO can express queries that are not Hanf-local. One can also show
that other forms of locality are violated in ∃MSO (see Exercise 7.6).
We now consider a related property of reachability. We assume that the
language of graphs is augmented by two constants, s and t, and we are interested
in the property, called (s, t)-reachability, that asks whether there is a
path from s to t in a given graph. We have seen that undirected connectivity
is not ∃MSO-deﬁnable; surprisingly, undirected (s, t)-reachability is!
Proposition 7.15. For undirected graphs without loops, (s, t)-reachability is
expressible in ∃MSO.
Proof. Consider the sentence ϕ in the language of graphs expanded with one
unary relation X that says the following:
1. both s and t are in X,
2. both s and t have an edge to exactly one member of X, and
3. every member of X except s and t has edges to precisely two members of
X.
Let Φ be ∃X ϕ. We claim that G |= Φ iﬀ there is a path from s to t in G.
Indeed, if there is a path from s to t, we can take X to be the shortest path
from s to t. Conversely, if (G, X) |= ϕ, then X is a path that starts in s; since
the graph G is ﬁnite, X must contain the last node on the path, which could
be only t.
The approach of Proposition 7.15 does not work for directed graphs, because
of back edges. Consider, for example, a directed graph which consists of
a chain {(s, a1), (a1, a2), (a2, a3), (a3, t)} together with the edge (a3, a1). The
only path between s and t consists of edges s, a1, a2, a3, t; however, if we let
X = {s, a1, a2, a3, t}, the sentence ϕ from the proof of Proposition 7.15 is
false, since a3 has one incoming edge, and two outgoing edges. It seems that
the approach of Proposition 7.15 could be generalized if there is a bound on
degrees in the input graph, and this is indeed the case (Exercise 7.7). However,
in general, one can show a negative result.
Theorem 7.16. Reachability for directed graphs is not expressible in ∃MSO.
We conclude this section by showing that there are games that characterize
expressibility in ∃MSO, much in the same way as Ehrenfeucht-Fra¨ıss´e games
and MSO games characterize expressibility in FO and MSO.
Deﬁnition 7.17. The l, k-Fagin game on two structures A, B ∈ STRUCT[σ]
is played as follows. The spoiler selects l subsets U1, . . . , Ul of A. Then the
duplicator selects l subsets V1, . . . , Vl of B. After that, the spoiler and the
7.3 Existential and Universal MSO on Graphs 123
duplicator play k rounds of the Ehrenfeucht-Fra¨ıss´e game on (A, U1, . . . , Ul)
and (B, V1, . . . , Vl).
The winning condition for the duplicator is that after k rounds of
the Ehrenfeucht-Fra¨ıss´e game, the elements played on (A, U1, . . . , Ul) and
(B, V1, . . . , Vl) form a partial isomorphism between these two structures.
A fairly simple generalization of the previous game proofs shows the fol-
lowing.
Proposition 7.18. A property P of σ-structures is ∃MSO-deﬁnable iﬀ there
exist l and k such that for every A ∈ STRUCT[σ] having P, and for every
B ∈ STRUCT[σ] not having P, the spoiler wins the l, k-Fagin game on A and
B.
This game, however, is often rather inconvenient for the duplicator to play
(after all, we use games to show that a certain property is inexpressible in a
logic, so we need the win for the duplicator). A somewhat surprising result
(see Exercise 7.9) shows that a diﬀerent game that is easier for the duplicator
to win, also characterizes the expressiveness of ∃MSO.
Deﬁnition 7.19. Let P be a property of σ-structures (that is, a class of σstructures
closed under isomorphism). The P, l, k-Ajtai-Fagin game is played
as follows:
1. The duplicator selects a structure A ∈ P.
2. The spoiler selects l subsets U1, . . . , Ul of A.
3. The duplicator selects a structure B ∈ P, and l subsets V1, . . . , Vl of B.
4. The spoiler and the duplicator play k rounds of the Ehrenfeucht-Fra¨ıss´e
game on (A, U1, . . . , Ul) and (B, V1, . . . , Vl).
The winning condition for the duplicator is that after k rounds of
the Ehrenfeucht-Fra¨ıss´e game, the elements played on (A, U1, . . . , Ul) and
(B, V1, . . . , Vl) form a partial isomorphism between these two structures.
Intuitively, this game is easier for the duplicator to win, because he selects
the second structure B and the coloring of it only after he has seen how the
spoiler chose to color the ﬁrst structure A.
Proposition 7.20. A property P of σ-structures is ∃MSO-deﬁnable iﬀ there
exist l and k such that the spoiler has a winning strategy in the P, l, k-AjtaiFagin
game.
Hence, to show that a certain property P is not expressible in ∃MSO, it
suﬃces to construct, for every l and k, a winning strategy for the duplicator
in the P, l, k-Ajtai-Fagin game. This is easier than a winning strategy in the
l, k-Fagin game, since the duplicator sees the sets Ui’s before choosing the
second structure B for the game. An example is given in Exercise 7.10.
124 7 Monadic Second-Order Logic and Automata
7.4 MSO on Strings and Regular Languages
We now study MSO on strings. Recall that a string over a ﬁnite alphabet can
be represented as a ﬁrst-order structure. For example, the string s = abaab
is represented as {1, 2, 3, 4, 5}, <, Pa, Pb , where < is the usual ordering, and
Pa and Pb contain positions in s where a (or b, respectively) occurs: that is,
Pa = {1, 3, 4} and Pb = {2, 5}.
In general, for a ﬁnite alphabet Σ, we deﬁne the vocabulary σΣ that
contains a binary symbol < and unary symbols Pa for each a ∈ Σ. A string
s ∈ Σ∗
of length n is then represented as a structure Ms ∈ STRUCT[σΣ]
whose universe is {1, . . . , n}, with < interpreted as the order on the natural
numbers, and Pa being the set of positions where the letter a occurs, for each
a in Σ.
Suppose we have a sentence Φ of some logic L, in the vocabulary σΣ. Such
a sentence deﬁnes a language, that is, a subset of Σ∗
, given by
L(Φ) = {s ∈ Σ∗
| Ms |= Φ}. (7.6)
We say that a language L is deﬁnable in a logic L if there exists an L-sentence
Φ such that L = L(Φ).
The following is a fundamental result that connects MSO-deﬁnability and
regular languages.
Theorem 7.21 (B¨uchi). A language is deﬁnable in MSO iﬀ it is regular.
Proof. We start by showing how to deﬁne every regular language L in MSO. If
L is regular, then its strings are accepted by a deterministic ﬁnite automaton
A = (Q, q0, F, δ), where Q = {q0, . . . , qm−1} is the set of states, q0 ∈ Q is
the initial state, F ⊆ Q is the set of ﬁnal states, and δ : Q × Σ → Q is the
transition function. We take Φ to be the MSO sentence
∃X0 . . . ∃Xm−1 ϕpart ∧ ϕstart ∧ ϕtrans ∧ ϕaccept. (7.7)
In this sentence, we are guessing m sets X0, . . . , Xm−1 that correspond to
elements of the universe of Ms (i.e., positions of s) where the automaton A is
in the state q0, q1, . . . , qm−1, respectively, and the remaining three ﬁrst-order
formulae ensure that the behavior of A is simulated correctly. That is:
• ϕpart asserts that X0, . . . , Xm−1 partition the universe of Ms. This is easy
to express in FO:
∀x
m−1
i=0
Xi(x) ∧
j=i
¬Xj(x) .
7.4 MSO on Strings and Regular Languages 125
• ϕstart asserts that the automaton starts in state q0:
∀x
a∈Σ
Pa(x) ∧ ∀y (y ≥ x) → Xδ(q0,a)(x) .
Note some abuse of notation: δ(q0, a) = qi for some i, but we write Xδ(q0,a)
instead of Xi.
• ϕtrans asserts that transitions are simulated correctly:
∀x∀y
m−1
i=0 a∈Σ
(x ≺ y) ∧ Xi(x) ∧ Pa(y) → Xδ(qi,a)(y) ,
where x ≺ y means that y is the successor of x.
• ϕaccepts asserts that at the end of the string, A enters an accepting state:
∀x ∀y (y ≤ x) →
qi∈F
Xi(x) .
Hence, (7.7) captures the behavior of A, and thus L(Φ) = L.
For the converse, let Φ be an MSO sentence in the vocabulary σΣ, and
let k = qr(Φ). Let τ0, . . . , τm enumerate all the rank-k MSO types of σΣ
structures (more precisely, rank-k 0, 0 types, with zero free ﬁrst- and secondorder
variables, or, in other words, sentences).
Let Ψi be an MSO sentence of quantiﬁer rank k deﬁning the type τi. That
is,
Ms |= Ψi ⇔ mso-tpk(Ms) = τi.
Since qr(Φ) = k, the sentence Φ is a disjunction of some of the Ψi’s. We deﬁne
F ⊆ {τ0, . . . , τm} to be the set of types consistent with Φ. Then Φ is equivalent
to τi∈F Ψi.
We further assume that τ0 is the type of Mǫ, where ǫ denotes the empty
string. That is, this is the only type among the τi’s that is consistent with
¬∃x (x = x).
We now deﬁne the automaton
AΦ = ({τ0, . . . , τm}, τ0, F, δΦ), (7.8)
with the set of states S = {τ0, . . . , τm}, the initial state τ0, the set of ﬁnal
states F, and the transition function δΦ : S × Σ → 2S
deﬁned as follows:
τj ∈ δF (τi, a) ⇔ ∃s ∈ Σ∗ mso-tpk(Ms) = τi
and mso-tpk(Ms·a) = τj
. (7.9)
We now claim that the automaton AΦ is deterministic (i.e., for every τi and
a ∈ Σ there is exactly one τj satisfying (7.9)). For that, notice that by a
126 7 Monadic Second-Order Logic and Automata
composition argument similar to that of Lemma 7.11, if s1, s2, t1, t2 ∈ Σ∗
are
such that Ms1 ≡MSO
k Mt1 and Ms2 ≡MSO
k Mt2 , then Ms1·s2 ≡MSO
k Mt1·t2 .
Now suppose that mso-tpk(Ms1 ) = mso-tpk(Ms2 ) = τi. In particular,
Ms1 ≡MSO
k Ms2 . Then Ms1·a ≡MSO
k Ms2·a. Suppose also that we have j1 = j2
such that mso-tpk(Ms1·a) = τj1 and mso-tpk(Ms2·a) = τj2 . Then Ms1·a |= Ψj1 ,
but since Ms2·a |= Ψj2 and qr(Ψj2 ) = k, we obtain Ms1·a |= Ψj2 , which implies
mso-tpk(Ms1·a) = τj2 = τj1 . This contradiction proves that the automaton
(7.8) is deterministic.
Now by a simple induction on the length of the string we prove that for
any string s, after reading s the automaton AΦ ends in the state τi such that
mso-tpk(Ms) = τi. For the empty string, this is our choice of τ0. Suppose now
that mso-tpk(Ms) = τi and AΦ is in state τi after reading s. By the deﬁnition
of the transition function δΦ and the fact that it is deterministic, if AΦ reads
a, it moves to the state τj such that mso-tpk(Ms·a) = τj, which proves the
statement.
Therefore, AΦ accepts a string s iﬀ mso-tpk(Ms) is in F, that is, is consistent
with Φ. The latter happens iﬀ Ms |= Φ, which proves that the language
accepted by AΦ is L(Φ). This completes the proof.
We have seen that over graphs, there are universal MSO-sentences which
are not expressible in ∃MSO. In contrast, over strings every MSO sentence
can be represented by an automaton, and (7.7) shows that the behavior of
every automaton can be captured by an ∃MSO sentence. Hence, we obtain
the following.
Corollary 7.22. Over strings, MSO = ∃MSO.
As an application of Theorem 7.21, we prove a few bounds on the expressive
power of MSO. We have seen before that MSO over the empty vocabulary
cannot express even. What about the power of MSO on linear orderings? Recall
that Ln denotes a linear ordering on n elements. From Theorem 7.21, we
immediately derive the following.
Corollary 7.23. Let X ⊆ N. Then the set {Ln | n ∈ X} is MSO-deﬁnable iﬀ
the language {an
| n ∈ X} is regular.
Thus, MSO can test, for example, if the size of a linear ordering is even,
or – more generally – a multiple of k for any ﬁxed k. On the other hand, one
cannot test in MSO if the cardinality of a linear ordering is a square, or the
kth power, for any k > 1; nor is it possible to test if such a cardinality is a
power of k > 1.
As a more interesting application, we show the following.
Corollary 7.24. It is impossible to test in MSO if a graph is Hamiltonian.
7.5 FO on Strings and Star-Free Languages 127
Proof. Let Kn,m denote the complete bipartite graph on sets of cardinalities
n and m; that is, an undirected graph G whose nodes can be partitioned
into two sets X, Y such that |X| = n, |Y | = m, and the set of edges is
{(x, y), (y, x) | x ∈ X, y ∈ Y }. Notice that Kn,m is Hamiltonian iﬀ n = m.
Assume that Hamiltonicity is deﬁnable in MSO. Let Σ = {a, b}. Given a
string s, we deﬁne, in FO, the following graph over the universe of Ms:
ϕ(x, y) ≡ Pa(x) ∧ Pb(y) ∨ Pb(x) ∧ Pa(y) .
That is, ϕ(Ms) is Kn,m, where n is the number of a’s in s, and m is the
number of b’s. Thus, if Hamiltonicity were deﬁnable in MSO, the language
{s ∈ Σ∗
| the number of a’s in s equals the number of b’s} would have been
a regular language, but it is well known that it is not (by a pumping lemma
argument).
7.5 FO on Strings and Star-Free Languages
Since MSO on strings captures regular languages, what can be said about the
class of languages captured by FO? It turns out that FO corresponds to a
well-known class of languages, which we deﬁne below.
Deﬁnition 7.25. A star-free regular expression over Σ is an expression built
from the symbols ∅ and a, for each a in Σ, using the operations of union (+),
complement (¯), and concatenation (·). Such a regular expression e denotes a
language L(e) over Σ as follows:
• L(∅) = ∅; L(a) = {a} for a ∈ Σ.
• L(e1 + e2) = L(e1) ∪ L(e2).
• L(¯e) = Σ∗
− L(e).
• L(e1 · e2) = {s1 · s2 | s1 ∈ L(e1), s2 ∈ L(e2)}.
A language denoted by a star-free expression is called a star-free language.
Note that some of the regular expressions that use the Kleene star ∗ are actually
star-free, because in the deﬁnition of star-free expressions one can use
the operation of complementation. For example, suppose Σ = {a, b}. Then
(a + b)∗
deﬁnes a star-free language, denoted by the star-free expression ¯∅.
Likewise, e = a∗
b∗
also denotes a star-free language, since it can be characterized
as a language in which there is no b preceding an a. A language with
a b preceding an a can be deﬁned as (a + b)∗
· ba · (a + b)∗
, and hence L(e) is
deﬁned by the star-free expression
¯∅ · b · a · ¯∅.
Theorem 7.26. A language is deﬁnable in FO iﬀ it is star-free.
128 7 Monadic Second-Order Logic and Automata
Proof. We show that every star-free language is deﬁnable in FO by induction
on the star-free expression. The empty language is deﬁnable by false, the
language {a} is deﬁnable by ∃!x (x = x) ∧ ∀x Pa(x). If e = ¯e1 and L(e1) is
deﬁnable by Φ, then ¬Φ deﬁnes L(e). If e = e1 + e2, with L(e1) and L(e2)
deﬁnable by Φ1 and Φ2 respectively, then Φ1 ∨ Φ2 deﬁnes L(e).
Now assume that e = e1 · e2, and again L(e1) and L(e2) are deﬁnable
by Φ1 and Φ2. Let x be a variable that does not occur in Φ1 and Φ2, and let
ϕi(x), i = 1, 2, be the formula obtained from Φ1 by relativizing each quantiﬁer
to the set of positions {y | y ≤ x} for ϕ1, and to {y | y > x} for ϕ2. More
precisely, we inductively replace each subformula ∃yψ of Φ1 by ∃y (y ≤ x)∧ψ,
and each such subformula of Φ2 by ∃y (y > x) ∧ ψ. Then, for a string s and a
position p, we have Ms |= ϕ1(p) iﬀ M≤p
s |= Φ1, where M≤p
s is the substructure
of Ms with the domain {1, . . . , p}. Furthermore, Ms |= ϕ2(p) iﬀ M>p
s |= Φ2,
where M>p
s is the substructure of Ms whose universe is the complement of
{1, . . . , p}. Hence, s ∈ L(e) iﬀ Ms |= ∃x ϕ1(x) ∧ ϕ2(x), which proves that
every star-free language is FO-deﬁnable.
We now prove the other direction: every FO-deﬁnable language is star-free.
For technical reasons (to get the induction oﬀ the ground), we expand σΣ with
a constant max, to be interpreted as the largest element of the universe. Since
max is FO-deﬁnable, this does not aﬀect the set of FO-deﬁnable languages.
The proof is now by induction on the quantiﬁer rank k of a sentence Φ.
Note that since star-free languages are closed under the Boolean operations,
an arbitrary Boolean combination of sentences deﬁning star-free languages
also deﬁnes a star-free language.
For k = 0, we have Boolean combinations of the sentences of the form
Pa(max), as well as true and false. The sentence Pa(max) deﬁnes the language
denoted by ¯∅ · a, true deﬁnes L(¯∅), and false deﬁnes L(∅).
Given the closure under Boolean combinations, for the inductive step it
suﬃces to consider sentences Φ = ∃xϕ(x), where qr(ϕ) = k.
Let τ0, . . . , τm enumerate all the rank-k FO-types (again, with respect to
sentences: we do not have free variables). We deﬁne
SΦ = (τi, τj)
for some s and a position p, Ms |= ϕ(p),
tpk(M≤p
s ) = τi and tpk(M>p
s ) = τj
.
Our goal is now to show the following: for every string u, Mu |= Φ iﬀ there
exists a position p in u such that for some (τi, τj) in SΦ, we have
tpk(M≤p
u ) = τi and tpk(M>p
u ) = τj. (7.10)
First, we notice that this claim implies that the language L(Φ) is star-free.
Indeed, each of τi is deﬁnable by an FO sentence Ψi of quantiﬁer rank k, and
hence by the induction hypothesis, each language L(Ψi) is star-free. Thus,
L(Φ) =
(τi,τj)∈SΦ
L(Ψi) · L(Ψj).
7.6 Tree Automata 129
That is, L(Φ) is a union of concatenations of star-free languages, and hence
it is star-free.
If Mu |= Φ, then the existence of p and a pair (τi, τj) follows from the
deﬁnition of SΦ. Conversely, suppose we have a string u and a position p such
that (7.10) holds. Since (τi, τj) ∈ SΦ, we can ﬁnd a string s with a position p′
in it such that Ms |= ϕ(p′
), tpk(M≤p′
s ) = τi, and tpk(M>p
s ) = τj. Hence,
M≤p
u ≡k M≤p′
s , M>p
u ≡k M>p′
s ,
and thus (Mu, p) ≡k (Ms, p′
). Since qr(ϕ) = k, it follows that Mu |= ϕ(p),
and hence Mu |= Φ, as claimed. This completes the proof.
Corollary 7.27. There exist regular languages which are not star-free.
Proof. The language denoted by (aa)∗
is regular, but clearly not star-free,
since even is not FO-deﬁnable over linear orders.
7.6 Tree Automata
We now move from strings to trees. Our goal is to deﬁne trees as ﬁrst-order
structures, and study MSO over them. We shall connect MSO with the notion
of tree automata. Tree automata play an important role in many applications,
including rewriting systems, automated theorem proving, veriﬁcation, and
recently database query languages, especially in the XML context.
We consider two kinds of trees in this section. Ranked trees have the property
that every node which is not a leaf has the same number of children (in
fact we shall ﬁx this number to be 2, but all the results can be generalized
to any ﬁxed k > 1). On the other hand, in unranked trees diﬀerent nodes can
have a diﬀerent number of children. We shall start with ranked (binary) trees.
Deﬁnition 7.28. A tree domain is a subset D of {1, 2}∗
that is preﬁx-closed;
that is, if s ∈ D and s′
is a preﬁx of D, then s′
∈ D. Furthermore, if s ∈ D,
then either both s · 1 and s · 2 are in D, or none of them is in D.
A Σ-tree T is a pair (D, f) where D is a tree domain and f is a function
from D to Σ (the labeling function).
We refer to the elements of D as the nodes of T . Every nonempty tree
domain has the node ǫ, which is called the root. A node s such that s·1, s·2 ∈ D
is called a leaf.
The ﬁrst tree in Fig. 7.2 is a binary tree. We show both the nodes and the
labeling in that picture. The nodes 111, 112, 12, 21, 22 are the leaves.
We represent a tree T = (D, f) as a ﬁrst-order structure
MT = D, ≺, (Pa)a∈Σ, succ1, succ2
130 7 Monadic Second-Order Logic and Automata
1
11
111 112
12
2
21 22
ǫ a
a
b b
a b
b
a a
ǫ
1
11
111 113
112
2
21 22
3
31 32
33
331
Fig. 7.2. Examples of a ranked and an unranked tree
of vocabulary σΣ expanded with two binary relations succ1 and succ2. Here
≺ is interpreted as the preﬁx relation on D (in particular, it is a partial order,
rather than a linear order, as was the case with strings), Pa is interpreted as
{s ∈ D | f(s) = a}, and succi is {(s, s · i) | s, s · i ∈ D}, for i = 1, 2.
We let Trees(Σ) be the set of all Σ-trees. If we have a sentence Φ of some
logic, it deﬁnes the set of trees (also called a tree language)
LT (Φ) = {T ∈ Trees(Σ) | MT |= Φ}.
Thus, we shall be talking about tree languages deﬁnable in various logics.
Deﬁnition 7.29 (Tree automata and regular tree languages). A (nondeterministic)
tree automaton is a tuple A = (Q, q0, δ, F), where Q is a ﬁnite
set of states, q0 ∈ Q, F ⊆ Q is the set of ﬁnal (accepting) states, and
δ : Q × Q × Σ → 2Q
is the transition function.
Given a tree T = (D, f), a run of A on T is a function r : D → Q such
that
• if s is a leaf labeled a, then r(s) ∈ δ(q0, q0, a);
• if r(s · 1) = q, r(s · 2) = q′
and f(s) = a, then r(s) ∈ δ(q, q′
, a).
A run is called successful if r(ǫ) ∈ F (the root is in the accepting state). The
set of trees accepted by A is the set of all trees T for which there exists a
successful run.
A tree language is called regular if it is accepted by a tree automaton.
7.6 Tree Automata 131
In a deterministic tree automaton, the transition function is δ : Q × Q ×
Σ → Q, and the deﬁnition of a run is modiﬁed as follows:
• if s is a leaf labeled a, then r(s) = δ(q0, q0, a);
• if r(s · 1) = q, r(s · 2) = q′
and f(t) = a, then r(s) = δ(q, q′
, a).
For example, consider a deterministic tree automaton A whose set of states
is {q0, qa, qb, q, q′
}, with F = {q′
}, and the transition function has the follow-
ing:
δ(q0, q0, a) = qa
δ(q0, q0, b) = qb
δ(qa, qb, b) = q
δ(qa, qa, b) = q′
δ(q, qb, a) = q
δ(q, q′
, a) = q′
.
Then this automaton accepts the ranked tree shown in Fig. 7.2: following the
deﬁnition of the transition function, we deﬁne the run r such that:
• for the leaves, r(111) = r(21) = r(22) = qa and r(112) = r(12) = qb;
• r(11) = δ(qa, qb, b) = q;
• r(1) = δ(q, qb, a) = q;
• r(2) = δ(qa, qa, b) = q′
; and ﬁnally,
• r(ǫ) = δ(q, q′
, a) = q′
, and since q′
∈ F, the automaton accepts.
We now establish the analog of Theorem 7.21 for trees, by showing that
regular tree languages are precisely those deﬁnable in MSO.
Theorem 7.30. A set of trees is deﬁnable in MSO iﬀ it is regular.
Proof. The proof is similar to that of Theorem 7.21. To ﬁnd an MSO deﬁnition
of the tree language accepted by an automaton A, we guess, for each state
q, the set Xq of nodes where the run of A is in state q, and then check, in
FO, that each leaf labeled a is in Xq for some q ∈ δ(q0, q0, a), that transitions
are modeled properly, and that the root is in one of the accepting states. The
sentence looks very similar to (7.7), and is in fact an ∃MSO sentence.
The proof of the converse, i.e., that MSO only deﬁnes regular languages,
again follows the proof in the string case. Suppose an MSO sentence Φ of
quantiﬁer rank k is given. We let τ0, . . . , τm enumerate all the rank-k MSO
types, with τ0 being the type of the empty tree, and take {τ0, . . . , τm} as the
set of states of an automaton AΦ. Since Φ is equivalent to a disjunction of
types, we let F = {τi | τi is consistent with Φ}. Finally,
τl ∈ δ(τi, τj, a)
132 7 Monadic Second-Order Logic and Automata
a
T1 T2
τi τj
Fig. 7.3. Illustration for the proof of Theorem 7.30
if there are trees T1 and T2 whose rank-k MSO types are τi and τj, respectively,
such that the rank-k MSO type of the tree obtained by hanging T1 and T2 as
children of a root node labeled a is τl (see Fig. 7.3).
Again, similarly to the proof of Theorem 7.21, one can show that AΦ is a
deterministic tree automaton accepting the tree language {T | T |= Φ}.
Corollary 7.31. Every tree automaton is equivalent to a deterministic tree
automaton, and every MSO sentence over trees is equivalent to an ∃MSO
sentence.
The connection between FO-deﬁnability and star-free languages does not,
however, extend to trees. There are several interesting logics between FO and
MSO, and some of them will be introduced in exercises.
We next show how to extend these results to unranked trees.
Deﬁnition 7.32 (Unranked trees). An unranked tree domain is a subset
D of {1, 2, . . .}∗
(ﬁnite words over positive integers) that is preﬁx-closed, and
such that for s · i ∈ D and j < i, the string s · j is in D as well. An unranked
tree is a pair (D, f), where D is an unranked tree domain, and f is the labeling
function f : D → Σ.
Thus, a node in an unranked tree can have arbitrarily many children. An
example is shown in Fig. 7.2 (the second tree). Some nodes – the root, nodes
11 and 3 – have three children; some have two (node 2), some have one (nodes
1 and 33).
The transition function for an automaton working on binary trees was
of the form δ : Q × Q × Σ → Q, based on the fact that each nonleaf node
has exactly two children. In an unranked tree, the number of children could
be arbitrary. The idea of extending the notion of tree automata to the unranked
case is then as follows: we have additional string automata that run on
the children of each node, and the acceptance conditions of those automata
determine the state of the parent node. This is formalized in the deﬁnition
below.
7.7 Complexity of MSO 133
Deﬁnition 7.33 (Unranked tree automata). An unranked tree automaton
is a triple A = (Q, q0, δ), where as before Q is the set of states, q0 is an
element of Q, and δ is the transition function δ : Q × Σ → 2Q∗
such that
δ(q, a) is a regular language over Q for every q ∈ Q and a ∈ Σ.
Given an unranked tree T = (D, f), a run of A on T is deﬁned as a
function r : D → Q such that the following holds:
• if s is a node labeled a, with children s · 1, . . . , s · n, then the string
r(s · 1)r(s · 2) . . . r(s · n) is in δ(r(s), a).
In particular, if s is a leaf, then r(s) = q implies that the empty string belongs
to δ(q, a).
A run is successful if r(ǫ) = q0, and T is accepted by A if there exists an
accepting run. An unranked tree language L is called regular if it is accepted
by an unranked tree automaton.
To connect regular languages with MSO-deﬁnability, we have to represent
unranked trees as structures. It is no longer suﬃcient to model just two successor
relations, since a node can have arbitrarily many successors. Instead,
we introduce an ordering on successor relations. That is, an unranked tree
T = (D, f) is represented as a structure
D, ≺, (Pa)a∈Σ, <sibl , (7.11)
where ≺, as before, is the preﬁx relation, Pa is interpreted as {s ∈ D | f(s) =
a}, and s′
<sibl s′′
iﬀ there is a node s and i, j ∈ N, i < j, such that s′
= s · i,
s′′
= s · j. In other words, s′
and s′′
are siblings, and s′
precedes s′′
.
Thus, when we talk about FO-deﬁnability, or MSO-deﬁnability over unranked
trees, we mean deﬁnability over structures of the form (7.11).
Finally, the connection between automata and MSO-deﬁnability extends
to unranked trees.
Theorem 7.34. An unranked tree language is regular iﬀ it is MSO-deﬁnable.
The proof of this theorem is similar in spirit to the proofs of Theorems
7.21 and 7.30, and is left as an exercise for the reader.
7.7 Complexity of MSO
In this section we study complexity of MSO. We have seen that MSO, and
even ∃MSO, are signiﬁcantly more expressive than FO: ∃MSO can express
NP-complete problems (3-colorability, for example), and by using negation,
we can express coNP-complete problems in MSO.
134 7 Monadic Second-Order Logic and Automata
This suggests a close connection between MSO and the polynomial hierarchy,
PH, for which NP and coNP are the two lowest levels above polynomial
time. Recall that the levels of the polynomial hierarchy are deﬁned
as Σp
0 = Πp
0 = Ptime, Σp
i = NPΣp
i−1 , and Πp
i is the set of problems whose
complement is in Σp
i (see Sect. 2.3).
We next show that the data complexity of MSO is well approximated by
the polynomial hierarchy (although MSO does not capture PH: for example,
Hamiltonicity is not MSO-expressible).
Proposition 7.35. For each level Σp
i or Πp
i of the polynomial hierarchy, there
exists a problem complete for that level which is expressible in MSO.
Proof. We show how to express a variant of QBF (quantiﬁed Boolean formulae),
which we used in the proof of Pspace-completeness of the combined
complexity of FO. We deﬁne the problem Σi-SAT as follows. Its input is a
formula of the form
(∃ . . . ∃)(∀ . . . ∀)(∃ . . . ∃) . . . ϕ, (7.12)
where ϕ is a propositional Boolean formula in conjunctive normal form, such
that each conjunct contains three propositional variables. The quantiﬁer preﬁx
starts with a block of existential quantiﬁers, followed by a block of universal
quantiﬁers, followed by a block of existential quantiﬁers, and so on – such that
there are i blocks of quantiﬁers. The output is “yes” if the formula (7.12) is
true.
The problem Πi-SAT is deﬁned similarly, except that in (7.12), the ﬁrst
block of quantiﬁers is universal. We use the known fact that Σi-SAT is complete
for Σp
i , and Πi-SAT is complete for Πp
i .
We now show how to encode an instance Φ (7.12) of Σi-SAT as a structure
AΦ. Its universe is the set of variables used in (7.12). It has four binary relations
R0, R1, R2, R3, and i + 1 unary relations E1, U2, E3, . . .. Each relation
Ek or Uk is interpreted as the set of variables quantiﬁed by the kth block of
quantiﬁers. Relations R0, R1, R2, R3 encode the formula ϕ: relation Ri corresponds
to all the conjuncts of ϕ in which exactly i variables appear positively.
That is, R0 has all the triples (x, y, z) such that (¬x ∨ ¬y ∨ ¬z) is a conjunct
of ϕ, R1 has all the triples (x, y, z) such that (x ∨ ¬y ∨ ¬z) is a conjunct of ϕ,
and so on.
Next, we ﬁnd an MSO sentence Ψ such that AΦ |= Ψ iﬀ Φ is true. This
sentence is of the form
∃X1 ⊆ E1∀X2 ⊆ U2∃X3 ⊆ E3 . . . ϕ′
,
where each Xi corresponds to the set of variables set to true in the ith quantiﬁer
block. The formula ϕ′
says that the variable assignment of the quantiﬁer
preﬁx of Ψ makes ϕ true. For example, for each triple (x, y, z) in R1, it would
state that either y or z belongs to some of the Xi’s, or x belongs to neither
7.7 Complexity of MSO 135
of them, and similarly for R0, R2, and R3. We leave the details to the reader.
The proof for Πi-SAT is almost identical: the sentence Φ must start with a
universal MSO quantiﬁer.
We shall return to complexity of SO in Chap. 9. For the combined complexity
of MSO, see Exercise 7.21.
Even though the complexity of MSO is quite high, for many interesting
structures, in particular, strings and trees, the connection with automata provides
nice bounds in terms of parameterized complexity.
Corollary 7.36. Over strings and trees (ranked and unranked), evaluating
MSO sentences is ﬁxed-parameter linear. In particular, over strings and trees,
the data complexity of MSO is linear.
Proof. Suppose we have a sentence Φ and a structure A (string or tree). We
convert Φ into a deterministic automaton, by Theorems 7.21, 7.30, and 7.34,
and run that automaton over A, which takes linear time.
Can Corollary 7.36 be extended to a larger class of structures? The answer
to this is positive, and it uses the concept of bounded treewidth we ﬁrst
encountered in Section 6.7. Recall that a class C of σ-structures is said to be
of bounded treewidth if there is a number k such that for every A ∈ C, the
treewidth of its Gaifman graph is at most k. (See Sect. 6.7 for the deﬁnition
of treewidth.)
Theorem 7.37 (Courcelle). Let C be a class of structures of bounded
treewidth. Then evaluating MSO sentences over C is ﬁxed-parameter linear.
In particular, the data complexity of MSO over C is linear.
Proof sketch. We outline the idea of the proof. For simplicity, assume that
our input structures are graphs. Given a graph G, compute, in linear time, its
tree decomposition, consisting of a tree T and a set Bt for each node t of T .
Since the treewidth is ﬁxed, say k, each Bt is of size at most k + 1, and thus
all the graphs generated by Bt’s can be explicitly enumerated. This allows
us to express MSO quantiﬁcation over the original graph G in terms of MSO
quantiﬁcation over T . Thus, we are now in the setting where MSO sentences
have to be evaluated over trees, and this problem is ﬁxed-parameter linear,
which can be shown by converting MSO sentences into tree automata, as in
Corollary 7.36.
Fixed-parameter linearity implies that the complexity of the modelchecking
is of the form f( Φ )· A . What can be said about the function f?
Even over strings, to achieve ﬁxed-parameter linearity, one has to convert Φ to
an automaton. We have seen this conversion in the proof of Theorem 7.21, and
it was based on computing all rank-k MSO-types. One can also convert MSO
sentences into automata directly, with existential quantiﬁers corresponding to
136 7 Monadic Second-Order Logic and Automata
nondeterministic guesses. For such a conversion, the main problem is negation,
since complementing nondeterministic automata is not easy: one has to
make them deterministic ﬁrst, and then reverse the roles of accepting and
rejecting states. Going from a nondeterministic automaton to a deterministic
one entails an exponential blow-up.
When we try to apply this reasoning to an MSO sentence of the form
(∃ . . . ∃)(∀ . . . ∀)(∃ . . . ∃) . . . ϕ,
we see that at each quantiﬁer alternation, one needs to make the automaton
deterministic. Hence, the size of the resulting automaton will be bounded by
(roughly)
22...n ﬀ
k times
,
where n is the size of the automaton corresponding to ϕ, and k is the number
of alternations of quantiﬁers. That is, the size of the automaton is actually
nonelementary in terms of Φ . We recall that a function f : N → N is
elementary if for some ﬁxed l,
f(n) < 22...n ﬀ
l times
for all n.
In fact, it is known that converting MSO formulae into automata is inherently
nonelementary. Thus, even though over some classes of structures MSO is
ﬁxed-parameter linear, the function of the parameter (that depends on the
MSO sentence) is extremely large. Exercise 7.22 shows that the complexity
cannot be lowered unless NP collapses to Ptime.
7.8 Bibliographic Notes
Second-order logic is described in most logic textbooks. Monadic second order
logic and its games can be found in Ebbinghaus and Flum [60].
Proposition 7.14 is from Fagin [71], the proof is from Fagin, Stockmeyer,
and Vardi [76]. Expressibility of undirected reachability in ∃MSO is due to
Kanellakis; the proof was published in [11]. Inexpressibility of directed reachability
in ∃MSO is due to Ajtai and Fagin [11].
The Fagin game is from Fagin [71], and the Ajtai-Fagin game is from [11].
For additional results on ∃MSO and its relatives, see a survey by Fagin [75]
and Ajtai, Fagin, and Stockmeyer [12], Janin and Marcinkowski [137], and
Schwentick [216].
Theorem 7.21 is due to B¨uchi [27], and the proof presented here follows
Ladner [160], see also Neven and Schwentick [187]. Corollary 7.24 is due to
Tur´an [236] and de Rougemont [56]; the proof here follows Makowsky [176].
Theorem 7.26 was proved by McNaughton and Papert [182]; the proof
based on games follows Thomas [233].
7.9 Exercises 137
For connections between automata theory, logical deﬁnability, and circuit
complexity, see Straubing [225].
In the proofs of Proposition 7.12 and Theorems 7.21 and 7.26 we used
the composition method already encountered in Chap. 3. The composition
techniques used here are a special case of the Feferman-Vaught Theorem [79].
For more on the composition method, see a recent survey by Makowsky [177],
as well as Exercises 7.25 and 7.26.
Tree automata are treated in several books and surveys [38, 90]. Theorem
7.30 is due to Thatcher and Wright [230]. The corresponding result for unranked
trees seems to be a part of folklore, and can be found in several papers
dealing with querying XML, e.g., Neven [186].
Proposition 7.35 also appears to be folklore. Completeness of Σp
i -SAT and
Πp
i -SAT is due to Stockmeyer [223] (it is also known that the quantiﬁer-free
formula can always be taken to be 3-CNF [59]). Theorem 7.37 was proved by
Courcelle [44]. Linearity of ﬁnding a tree decomposition of small treewidth is
from Bodlaender [24]. The nonelementary complexity of the translation from
MSO to automata is due to Stockmeyer and Meyer [224].
Sources for exercises:
Exercises 7.2–7.5: Courcelle [45, 46]
Exercise 7.11: Schwentick [215]
Exercise 7.12: Cosmadakis [42]
Exercise 7.13: Otto [190]
Exercise 7.14: Matz, Schweikardt, and Thomas [180]
Exercise 7.16: Thomas [231]
Exercises 7.18 and 7.19: Thomas [232, 233]
Exercise 7.20: Blumensath and Gr¨adel [23]
Bruy`ere et al. [26], Benedikt et al. [21]
Benedikt and Libkin [20]
Exercise 7.22: Frick and Grohe [85]
Exercise 7.23: Grandjean and Olive [105]
Schwentick [216], Lynch [174]
Exercise 7.25: Makowsky [177]
Exercise 7.26: Courcelle and Makowsky [47]
Exercise 7.27: Seese [218]
7.9 Exercises
Exercise 7.1. Prove Theorem 7.7.
Exercise 7.2. Prove that the following properties of an undirected graph G are
expressible in MSO:
• G is planar;
• G is a tree.
138 7 Monadic Second-Order Logic and Automata
Exercise 7.3. Prove that the following properties of an undirected graph G are
expressible in ∃MSO:
• G is not planar;
• G is not a tree;
• G is not chordal (recall that a chord of a cycle C is an edge (a, b) such that a, b
are nodes of C, but the edge (a, b) is not in C; a graph without loops is chordal
if it has no cycle of length at least 4 without a chord).
Exercise 7.4. Consider a diﬀerent representation of graphs as ﬁrst-order structures.
Given a graph G, we create a structure AG = AG, PG whose universe is the disjoint
union of nodes and edges of G, and PG is a ternary relation that consists of pairs
(a, e, b), where e is the edge (a, b) in G.
Prove that over such a representation of graphs, Hamiltonicity is MSO-deﬁnable.
Exercise 7.5. Corollary 7.24 and Exercise 7.4 show that the expressive power of
MSO is diﬀerent over two representation of graphs: one with the universe consisting
of nodes, and the other one with the universe consisting of both nodes and edges.
Prove that if we restrict the class of graphs to be one of the following:
• graphs of bounded degree, or
• planar graphs, or
• graphs of treewidth at most k, for a ﬁxed k,
then the expressive power of MSO over the two diﬀerent representations of graphs
is the same.
Exercise 7.6. Prove that ∃MSO can express queries that are not Gaifman-local
and violate the BNDP.
Exercise 7.7. Prove that for each ﬁxed k, directed reachability is expressible in
∃MSO over graphs whose in-degrees and out-degrees do not exceed k.
Exercise 7.8. Prove Theorem 7.16.
Conclude that undirected reachability is in ∀MSO ∩ ∃MSO, while directed
reachability is in ∀MSO − ∃MSO.
Exercise 7.9. Prove Proposition 7.20.
Exercise 7.10. Use Ajtai-Fagin games to prove that there is no ∃MSO sentence Φ
such that, if a graph G is a disjoint union of two cycles, then G |= Φ iﬀ the cycles
are of the same size.
Exercise 7.11. Prove that graph connectivity is not deﬁnable in ∃MSO+ <.
Exercise 7.12. Prove that non-3-colorability of graphs cannot be expressed in
∃MSO.
Exercise 7.13. Prove that the number of second-order quantiﬁers in ∃MSO gives
rise to a strict hierarchy.
Exercise 7.14. Prove that the alternation depth of second-order quantiﬁers in MSO
gives rise to a strict hierarchy.
7.9 Exercises 139
Exercise 7.15. Prove the composition result used in the proof of Theorem 7.21.
That is, if s1, s2, t1, t2 ∈ Σ∗
are such that Ms1 ≡MSO
k Mt1 and Ms2 ≡MSO
k Mt2 , then
Ms1·s2 ≡MSO
k Mt1·t2 .
Exercise 7.16. Prove that over strings, every MSO sentence is equivalent to an
∃MSO sentence with a single second-order quantiﬁer.
Exercise 7.17. Complete the proof of Theorem 7.30, and prove Theorem 7.34.
Exercise 7.18. Consider a restriction of MSO on binary trees, in which we only
allow second-order quantiﬁcations over antichains: sets of nodes X such that for
s, s′
∈ X, s = s′
, neither s ≺ s′
nor s′
≺ s holds. Such a logic is called the antichain
logic.
Prove that every regular tree language is deﬁnable in the antichain logic.
Exercise 7.19. Next, consider a restriction of MSO on binary trees, in which we
only allow second-order quantiﬁcations over chains: sets of nodes X such that for
s, s′
∈ X, s = s′
, either s ≺ s′
or s′
≺ s holds.
Prove that there are regular tree languages that are not deﬁnable in this restriction
of MSO.
Exercise 7.20. Let s1, . . . , sn ∈ Σ∗
. We construct a string [s] over the alphabet
(Σ ∪ {#})n
, whose length is the maximum of the lengths of sj’s, and whose ith
symbol is a tuple (c1, . . . , cn), where each ck is the ith symbol of sk, if the length
of sk is at least i, or # otherwise. We say that a set S ⊆ (Σ∗
)n
is regular if the set
{[s] | s ∈ S} ⊆ (Σ ∪ {#})n
is regular.
Consider the inﬁnite ﬁrst-order structure M whose universe is Σ∗
, and the predicates
include ≺ (the preﬁx relation), a unary predicate La for each a in Σ, such
that La(x) holds iﬀ the last symbol of x is a, and a binary predicate el such that
el(s, s′
) holds iﬀ the length of s equals the length of s′
.
We call a subset S of (Σ∗
)n
deﬁnable in M if there is an FO formula ϕ(x1, . . . , xn)
in the vocabulary of M such that S = {s | M |= ϕ(s)}.
Prove the following:
(a) A subset of (Σ∗
)n
is deﬁnable in M iﬀ it is regular.
(b) A subset of Σ∗
is deﬁnable in M by a formula that does not mention the equal
length predicate iﬀ it is star-free.
(c) Generalize (a) to binary trees.
Exercise 7.21. Prove that the combined complexity of MSO is Pspace-complete.
Exercise 7.22. Prove that if the model-checking problem for MSO on strings can
be solved in time f( Φ ) · p(|s|), for a polynomial p and an elementary function f,
then Ptime = NP.
Exercise 7.23. Deﬁne complexity class NLin as the class of problems accepted by
nondeterministic RAMs in linear time. Consider a diﬀerent encoding of strings as
ﬁnite structures. A string s = s1 . . . sn ∈ {0, 1}∗
is encoded as follows. Partition s
into m pieces of length 1
2
·log n, where m = ⌈ 2n
log n
⌉. Let gs(i) be the number encoded
by the ith piece of the partitioned string. We deﬁne a structure Ms whose universe
is {1, . . . , m}, and the vocabulary consists of two unary functions, one interpreted
as gs, and the other one as successor.
Prove the following:
140 7 Monadic Second-Order Logic and Automata
• A set of strings S is in NLin iﬀ there exists a sentence Φ of the form
∃F1 . . . ∃Fm ∀x ϕ,
where Fi’s are unary function symbols, and ϕ is quantiﬁer-free, such that S =
{s | Ms |= Φ}.
• Every set of strings in NLin is deﬁnable by an ∃MSO sentence in the presence
of a built-in addition relation.
Exercise 7.24. Using the fact that the MSO theory of ﬁnite trees is decidable
(Rabin [202]), prove that the MSO theory of ﬁnite forests is decidable.
Exercise 7.25. Deﬁne Thk
MSO(A) as the set of all MSO[k] sentences true in A.
Notice that Thk
MSO(A) is a ﬁnite object.
We call an m-ary operation F on structures of the same vocabulary MSO-smooth
if Thk
MSO(F(A1, . . . , Am)) is uniquely determined by, and can be computed from,
Thk
MSO(A1), . . . , Thk
MSO(Am), for every k. Prove that the disjoint union of structures,
root joining of trees, and concatenation of words are MSO-smooth.
Exercise 7.26. A class C of structures is MSO-inductive if it is the smallest class of
structures that can be constructed from a ﬁxed ﬁnite set of structures using a ﬁxed
ﬁnite set of MSO-smooth operations. Such a construction naturally yields, for each
structure A ∈ C, its parse tree TA.
Prove that the following are MSO-inductive classes of structures:
• words;
• forests;
• graphs of treewidth at most l, for a ﬁxed l.
Also prove that for a ﬁxed MSO sentence Φ, checking whether A |= Φ can be
solved in time linear in the size of TA, if A ∈ C.
Exercise 7.27. Consider representation of graphs from Exercise 7.4. Prove that if
C is a class of ﬁnite graphs, and its MSO theory in that representation is decidable,
then C is of bounded treewidth.
Hint: you will have to use decidability of the MSO theory of graphs of bounded
treewidth, undecidability of the MSO theory of grids (Cartesian products of successor
relations), and the fact, due to Robertson and Seymour [204], that a class of
graphs of unbounded treewidth has arbitrarily large grids as its minors.
Exercise 7.28.∗
Is every query in (MSO+ <)inv deﬁnable in the expansion of
MSO with unary generalized quantiﬁers (see the deﬁnition in the next chapter)
Qmx ϕ(x, y) such that A |= Qmx ϕ(x, a) holds iﬀ |ϕ(A, a)| mod m = 0?
8
Logics with Counting
We continue dealing with extensions of ﬁrst-order logic. We have seen that the
expressive power of FO on ﬁnite structures is limited in a number of ways: it
cannot express counting properties, nor is it capable of expressing properties
that require iterative algorithms, as those typically violate locality.
In this chapter we address FO’s inability to count. As we saw earlier,
nontrivial properties of cardinalities are not expressible in FO: for example,
a sentence of quantiﬁer rank n cannot distinguish any two linear orders of
cardinality over 2n
. Comparisons of cardinalities, such as testing if |A|>|B |,
are inexpressible too.
We ﬁrst introduce two possible ways of extending FO that add counting
power to it: one is to use counting quantiﬁers and two-sorted structures, the
other is to use generalized unary quantiﬁers. We shall mostly concentrate on
counting quantiﬁers, as unary quantiﬁers can be simulated with them. We
shall see a very powerful counting logic, expressing arbitrary properties of
cardinalities, and yet we show that this logic is local. We also address the
problem of complexity of some of the counting extensions of FO.
8.1 Counting and Unary Quantiﬁers
Suppose we want to ﬁnd an extension of FO capable of expressing the parity
query: if U is a unary predicate in the vocabulary σ, and A ∈ STRUCT[σ], is
|UA
| even? How can one do it?
One approach is to add enough expressiveness to the logic to ﬁnd cardinalities
of some sets: for example, sets deﬁnable by other formulae. Thus, if we
have a formula ϕ(x), we want to ﬁnd the cardinality of ϕ(A) = {a | A |= ϕ(a)}.
The problem is that | ϕ(A) | is a number, and hence the logic must be adequately
equipped to deal with numbers. To be able to use |ϕ(A)|, we introduce
counting quantiﬁers:
∃ix ϕ(x)
142 8 Logics with Counting
is a formula with a new free variable i, which states that there are at least i
elements a of A such that ϕ(a) holds.
The variable i must range over some numerical domain (which, as we shall
see, is diﬀerent for diﬀerent counting logics). On that numerical domain, we
should have some arithmetic operations available (e.g., addition and multiplication),
as well as quantiﬁcation over it, so that sentences in the logic could
be formed.
Without yet giving a formal deﬁnition of the logic that extends FO with
counting quantiﬁers, we show, as an example, how parity is deﬁnable in it:
∃i∃j (i = j + j) ∧ ∃ixϕ(x) ∧ ∀k (k > i) → ¬∃kx ϕ(x) .
This sentence says that we can ﬁnd an even number i (since it is of the form
2j) such that exactly i elements satisfy ϕ(x): that is, at least i elements satisfy
ϕ, and for every k > i, we cannot ﬁnd k elements that satisfy ϕ.
Note that we really have two diﬀerent kinds of variables: variables that
range over the domain of A, and variables that range over some numerical
domain. Such a logic is called two-sorted. Formally, a structure for such a
logic has two universes: one is the non-numerical universe (we shall normally
refer to it as ﬁrst-sort universe) and the numerical, second-sort universe. We
now give the formal deﬁnition of the logic FO(Cnt).
Deﬁnition 8.1 (FO with counting). Given a vocabulary σ, a σ-structure
for FO with counting, FO(Cnt), is a structure of the form
{a0, . . . , an−1}, {0, . . ., n − 1}, (Ri)A
, +, ×, min, max
where {a0, . . . , an−1}, (Ri)A
is a structure from STRUCT[σ] (Ri ranges over
the symbols in σ), + and × are ternary relations {(i, j, k) | i + j = k} and
{(i, j, k) | i·j = k} on {0, . . . , n−1}, min denotes 0 and max denotes n−1. We
shall assume that the universes {a0, . . . , an−1} and {0, . . . , n−1} are disjoint.
Formulae of FO(Cnt) can have free variables of two sorts, ranging over the
two universes. We normally use i, j, k, ı,  for second-sort variables. FO(Cnt)
extends the deﬁnition of FO by the following rules:
• min, max are terms of the second sort. Also, every second-sort variable i
is a term of the second sort.
• If t1, t2, t3 are terms of the second sort, then +(t1, t2, t3) and ×(t1, t2, t3)
are formulae (which we shall normally write as t1+t2 = t3 and t1·t2 = t3).
• If ϕ(x, ı) is a formula, then ∃i ϕ(x, ı) is a formula. The quantiﬁer ∃i binds
the second-sort variable i.
• If ϕ(y, x, ı) is a formula, then ψ(x, i, ı) ≡ ∃iyϕ(y, x, ı) is a formula. The
quantiﬁer ∃iy binds the ﬁrst-sort variable y but not the second-sort variable
i.
8.1 Counting and Unary Quantiﬁers 143
For the semantics of this logic, only the last item needs explanation. Suppose
we have a structure A, and we ﬁx an interpretation a for x (from
{a0, . . . , an−1}), ı0 for i, and i0 for i (from {0, . . . , n − 1}). Then A |=
ψ(a, i0, ı0) iﬀ
|{b ∈ {a0, . . . , an−1} | A |= ϕ(b, a, ı0)}| ≥ i0.
If we have a σ-structure A, there is a two-sorted structure A′
naturally
associated with A. Assuming A = {a0, . . . , an−1}, we let the numerical domain
of A′
be {0, . . . , n − 1}, with min and max interpreted as 0 and n − 1, and
+ and × getting their usual interpretations. Hence, for A ∈ STRUCT[σ], we
shall write A |= ϕ whenever ϕ is an FO(Cnt) formula, instead of the more
formal A′
|= ϕ.
Let us see a few examples of deﬁnability in FO(Cnt). First, the usual linear
ordering on numbers is deﬁnable: i ≤ j iﬀ ∃k (i + k = j). Note that this does
not imply deﬁnability of ordering on the ﬁrst-sort universe; in fact we shall
see that with such an ordering, FO(Cnt) is more powerful than FO(Cnt) on
unordered ﬁrst-sort structures (similarly to the case of FO, shown in Theorem
5.3, and MSO, shown in Corollary 7.13).
We can deﬁne a formula ∃!ixϕ(x, · · · ) saying that there are exactly i elements
satisfying ϕ:
∃!ixϕ(x, · · · ) ≡ ∃ixϕ(x, · · · ) ∧ ∀k (k > i) → ¬∃kxϕ(x, · · · ) .
We can also compare cardinalities of two sets. Suppose we have two formulae
ϕ(x) and ψ(x); to test if |ϕ(A)|>|ψ(A)|, one could write
∃i ∃ixϕ(x) ∧ ¬∃ixψ(x) .
One can also write a formula for the majority predicate MAJ(ϕ, ψ) testing if
the set ϕ(A) contains at least half of the set ψ(A):
∃i∃j (∃!ix(ϕ(x) ∧ ψ(x))) ∧ (∃!jxψ(x)) ∧ (i + i ≥ j) .
Note that the deﬁnition of FO(Cnt) allows us to use formulae of the form
t1(ı) {=, >, ≥} t2(ı), where t1 and t2 are terms. For example, (i + i ≥ j) is
∃k (k = i + i ∧ k ≥ j).
We now present another way of adding counting power to FO that does
not involve two-sorted structures. Suppose we want to state that | ϕ(A) | is
even. We deﬁne a new quantiﬁer, Qeven, that binds one variable, and write
Qevenx ϕ(x). In fact, more generally, for a formula with several free variables
ϕ(x, y), we can construct a new formula Qevenx ϕ(x, y), with free variables y.
Its semantics is deﬁned as follows. If a is the interpretation for y, then
A |= Qevenx ϕ(x, a) ⇔ |{b | A |= ϕ(b, a)}| mod 2 = 0.
144 8 Logics with Counting
Using the same approach, we can do cardinality comparisons. For example,
let QH be a quantiﬁer that binds two variables; then for two formulae ϕ1(x, y)
and ϕ2(z, y), we have a new formula ψ(y) ≡ QHx, z (ϕ1(x, y), ϕ2(z, y)) such
that
A |= ψ(a) ⇔ |{b | A |= ϕ1(b, a)}| = |{b | A |= ϕ2(b, a)}|.
The quantiﬁer QH is known as the H¨artig, or equicardinality, quantiﬁer. Another
example is the Rescher quantiﬁer QR. The formation rule is the same
as for the H¨artig quantiﬁer, and
A |= QRx, z (ϕ1(x, a), ϕ2(z, a))
⇔ |{b | A |= ϕ1(b, a)}| > |{b | A |= ϕ2(b, a)}|.
What is common to these deﬁnitions? In all the cases, we construct sets of the
form ϕ(A, a) = {b ∈ A | A |= ϕ(b, a)} ⊆ A, and then make some cardinality
statements about those sets. This idea admits a nice generalization.
Deﬁnition 8.2 (Unary quantiﬁers). Let σu
k be a vocabulary of k unary relation
symbols U1, . . . , Uk, and let K ⊆ STRUCT[σu
k ] be a class of structures
closed under isomorphisms. Then QK is a unary quantiﬁer and FO(QK) extends
the set of formulae of FO with the following additional rule:
if ψ1(x1, y1), . . . , ψk(xk, yk) are formulae,
then QKx1 . . . xk(ψ1(x1, y1), . . . , ψk(xk, yk)) is a formula.
(8.1)
Here QK binds xi in the ith formula, for each i = 1, . . . , k. A free occurrence
of a variable y in ψi(xi, yi) remains free in this new formula unless y = xi.
The semantics of QK is deﬁned as follows:
A |= QKx1 . . . xk(ψ1(x1, a1), . . . , ψk(xk, ak))
⇔ A, ψ1(A, a1), . . . , ψk(A, ak) ∈ K.
(8.2)
In this deﬁnition, ai is a tuple of parameters that gives the interpretation for
those free variables of ψi(xi, yi) which are not equal to xi.
If Q is a set of unary quantiﬁers, then FO(Q) is the extension of FO with
the formation rule above for each QK ∈ Q.
The quantiﬁer rank of formulae with unary quantiﬁers is deﬁned by the
additional rule:
qr(QKx1, . . . , xk(ψ1(x1, y1), . . . , ψk(xk, yk)))
= max{qr(ψi(xi, yi)) | i ≤ k} + 1.
(8.3)
The three examples seen earlier are all unary quantiﬁers: for Qeven, the
class K consists of structures A, U such that |U | is even; for QH, it consists
of structures A, U1, U2 with |U1 |=|U2 |, and for QR, it consists of structures
A, U1, U2 with | U1 |>| U2 |. Note that the usual quantiﬁers ∃ and ∀ are
8.2 An Inﬁnitary Counting Logic 145
examples of unary quantiﬁers too: the classes of structures corresponding to
them consist of A, U with U = ∅ and U = A, respectively.
We shall see that the two ways of adding counting power to a logic – by
means of counting quantiﬁers, or unary quantiﬁers – are essentially equivalent
in their expressiveness. Formulae with counting quantiﬁers tend to be easier
to understand, but the logic becomes two-sorted. Unary quantiﬁers, on the
other hand, let us keep the logic one-sorted, but then a new quantiﬁer has to
be introduced for each counting property we wish to express.
8.2 An Inﬁnitary Counting Logic
The goal of this section is to introduce a very powerful counting logic: so
powerful, in fact, that it can express arbitrary properties of cardinalities, even
nonrecursive ones. Yet we shall see that this logic cannot address another
limitation of FO, namely, expressing iterative computations. We shall later
see another logic that expresses very powerful forms of iteration, and yet is
unable to count. Both of these logics are based on the idea of expanding FO
with inﬁnitary connectives.
Deﬁnition 8.3 (Inﬁnitary connectives and L∞ω). The logic L∞ω is deﬁned
as an extension of FO with inﬁnitary connectives and : if ϕi’s are
formulae, for i ∈ I, where I is not necessarily ﬁnite, and the free variables of
all the ϕi’s are among x, then
i∈I
ϕi and
i∈I
ϕi
are formulae. Their free variables are those variables in x that occur freely in
one of the ϕ’s.
The semantics is deﬁned as follows: A |=
i∈I
ϕi(a) if for some i ∈ I, it is
the case that A |= ϕi(a), and A |=
i∈I
ϕ(a) if A |= ϕi(a) for all i ∈ I.
This logic per se is too powerful to be of interest in ﬁnite model theory, in
view of the following.
Proposition 8.4. Let C be a class of ﬁnite structures closed under isomorphism.
Then there is an L∞ω sentence ΦC such that A ∈ C iﬀ A |= ΦC.
Proof. Recall that by Lemma 3.4, for every ﬁnite B, there is a sentence ΦB
such that A |= ΦB iﬀ A ∼= B. Hence we take ΦC to be
B∈C
ΦB.
Clearly, A |= ΦC iﬀ A ∈ C.
146 8 Logics with Counting
However, we can make logics with inﬁnitary connectives useful by putting
some restrictions on them. Our goal now is to deﬁne a two-sorted counting
logic L∗
∞ω(Cnt). We do it in two stages: ﬁrst, we extend L∞ω with some
counting features, and second, we impose restrictions that make the logic
suitable in the ﬁnite model theory context.
The structures for this logic are two-sorted, but the second sort is no longer
interpreted as an initial segment of the natural numbers: now it is the whole
set N. Furthermore, there is a constant symbol for each k ∈ N (which we also
denote by k). Hence, a structure is of the form
{a1, . . . , an}, N, (RA
i ), {k}k∈N , (8.4)
where again {a1, . . . , an}, (RA
i ) is a ﬁnite σ-structure, and Ri’s range over
symbols in σ.
We now deﬁne L∞ω(Cnt), an extremely powerful two-sorted logic, that
extends inﬁnitary logic L∞ω. Its structures are two-sorted structures (8.4),
and the logic extends L∞ω by the following rules:
• Each variable or constant of the second sort is a term of the second sort.
• If ϕ is a formula and x is a tuple of free ﬁrst-sort variables in ϕ, then
#x.ϕ is a term of the second sort, and its free variables are those in ϕ
except x.
The interpretation of this term is the number of tuples a over the
ﬁnite ﬁrst-sort universe that satisfy ϕ. That is, given a structure A with
the ﬁrst-sort universe A, a formula ϕ(x, y, ı) and the interpretations b and
ı0 for y and ı, respectively, the value of the term #x.ϕ(x, b, ı0) is
|{a ∈ A|x|
| A |= ϕ(a, b, ı0)}| .
• Counting quantiﬁers ∃ixϕ, with the same semantics as before, except that
i could be an arbitrary natural number.
The logic L∞ω(Cnt) is enormously powerful: it can deﬁne not only every
property of ﬁnite models (since it contains L∞ω), but also every predicate or
function on N. That is, P ⊆ Nk
is deﬁnable by
ϕP (i1, . . . , ik) =
(n1,...,nk)∈P
(i1 = n1) ∧ . . . ∧ (ik = nk) . (8.5)
Note that the deﬁnition is also redundant: for example, ∃ix ϕ can be
replaced by #x.ϕ ≥ i. However, we need counting quantiﬁers separately, as
will become clear soon.
Next, we restrict the logic by deﬁning the rank of a formula, rk(ϕ). Its
deﬁnition is similar to that of quantiﬁer rank, but there is one important
diﬀerence. In a two-sorted logic, we may have quantiﬁcation over two diﬀerent
universes. In the deﬁnition of the rank, we disregard quantiﬁcation over N.
Thus, rk(ϕ) and rk(t), where t is a term, are deﬁned inductively as follows:
8.2 An Inﬁnitary Counting Logic 147
• rk(t) = 0 if t is a variable, or a term k for k ∈ N.
• rk(ϕ) = 0 if ϕ is an atomic formula of vocabulary σ (i.e., an atomic ﬁrstsort
formula).
• rk(t1 = t2) = max{rk(t1), rk(t2)}, where t1 and t2 are terms.
• rk(¬ϕ) = rk(ϕ).
• rk(#x.ϕ) = rk(ϕ)+ |x|.
• rk( ϕj) = rk( ϕj) = supj rk(ϕj).
• rk(∀x ϕ) = rk(∃x ϕ) = rk(∃ix ϕ) = rk(ϕ) + 1.
• rk(∀i ϕ) = rk(∃i ϕ) = rk(ϕ).
Note that if ϕ is an FO formula, then rk(ϕ) = qr(ϕ).
Deﬁnition 8.5. L∗
∞ω(Cnt) is deﬁned as the restriction of L∞ω(Cnt) to formulae
and terms that have ﬁnite rank.
This logic is clearly closed under the Boolean connectives and both ﬁrstand
second-sort quantiﬁcation. It is not closed under inﬁnitary connectives:
for example, if Φi, i > 0, are L∗
∞ω(Cnt) sentences such that rk(Φi) = i, then
i Φi is not an L∗
∞ω(Cnt) sentence.
Note also that (8.5) implies that every subset of Nk
, k > 0, is deﬁnable
by an L∗
∞ω(Cnt) formula of rank 0. Thus, we assume that +, ·, −, ≤, and in
fact every predicate on natural numbers is available. To give an example, we
can express properties like: there is a node in the graph whose in-degree i and
out-degree j satisfy p2
i > pj where pi stands for the ith prime. This is done by
∃x∃i∃j (i = #y.E(y, x))∧(j = #y.E(x, y))∧P(i, j), where P is the predicate
on N for the property p2
i > pj.
Known expansions of FO with counting properties are contained in
L∗
∞ω(Cnt).
Proposition 8.6. For every FO, FO(Cnt), or FO(Q) formula, where Q is a
collection of unary quantiﬁers, there exists an equivalent L∗
∞ω(Cnt) formula
of the same rank.
Proof. The proof is trivial for FO and FO(Cnt). For FO(Q), assume we have
a formula
ψ(y1, . . . , yk) ≡ QKx1 . . . xk.(ψ1(x1, y1), . . . , ψk(xk, yk)), (8.6)
where K is a class of σu
k -structures A = A, U1, . . . , Uk closed under isomorphism.
Let Π be the set of all 2k
mapping π : {1, . . . , k} → {0, 1}, and for a
structure A ∈ K, let
π(A) =
i:π(i)=1
UA
i ∩
j:π(j)=0
(A − UA
j ) .
148 8 Logics with Counting
With each structure A, we then associate a tuple Π(A) = (π(A))π∈Π, with
π’s ordered lexicographically. Since K is a class of unary structures closed
under isomorphism, A ∈ K and Π(A) = Π(B) imply B ∈ K.
This provides a translation of (8.6) into L∗
∞ω(Cnt) as follows. Let
PK(n0, . . . , n2k−1) be the predicate on N that holds iﬀ (n0, . . . , n2k−1) is of
the form Π(A) for some A ∈ K. Then (8.6) translates into
PK #x.ψπ0 (x, y1, . . . , yk), . . . , #x.ψπ2k−1
(x, y1, . . . , yk) , (8.7)
where π0, . . . , π2k−1 is the enumeration of Π in the lexicographic ordering,
and
ψπ(x, y1, . . . , yk) =
i:π(i)=1
ψi(x, yi) ∧
j:π(j)=0
¬ψi(x, yi).
Thus, if b1, . . . , bk interpret y1, . . . , yk, respectively, in a structure B, then the
value of #x.ψπ(x, b1, . . . , bk) in B is precisely
π B, ψ1(B, b1), . . . , ψk(B, bk) .
Therefore, (8.7) holds for b1, . . . , bk in B iﬀ the σu
k -structure
B, ψ1(B, b1), . . . , ψk(B, bk) is in K. This proves the equivalence of
(8.6) and (8.7). Finally, since PK is a numerical predicate, it has rank 0, and
hence the rank of (8.7) is max{rk(ψ1), . . . , rk(ψk)} + 1 = rk(ψ), which proves
the proposition.
In general, L∗
∞ω(Cnt) can be viewed as an extremely powerful counting
logic: we can deﬁne arbitrary cardinalities of sets of tuples over a structure, and
on those, we can use arbitrary numerical predicates. Compared to L∗
∞ω(Cnt),
a logic such as FO(Cnt) restricts us in what sort of cardinalities we can
deﬁne (only those of sets given by formulae in one free variable), and what
operations we can use on those cardinalities (those deﬁnable with addition
and multiplication).
We now introduce what seems to be a drastic simpliﬁcation of L∗
∞ω(Cnt).
Deﬁnition 8.7. The logic L◦
∞ω(Cnt) is deﬁned as L∗
∞ω(Cnt) where counting
terms #x.ϕ and quantiﬁcation over N are not allowed.
On the surface, L◦
∞ω(Cnt) is a lot simpler than L∗
∞ω(Cnt), mainly because
counting terms for vectors, #x.ϕ, are very convenient for deﬁning complex
counting properties. But it turns out that the power of L◦
∞ω(Cnt) and
L∗
∞ω(Cnt) is identical.
Proposition 8.8. There is a translation ϕ → ϕ◦
of L∗
∞ω(Cnt) formulae into
L◦
∞ω(Cnt) formulae such that ϕ and ϕ◦
are equivalent and rk(ϕ) = rk(ϕ◦
).
8.2 An Inﬁnitary Counting Logic 149
Proof. It is easy to eliminate quantiﬁers over N without increasing the rank:
∃i ϕ(i, · · · ) and ∀i ϕ(i, · · · ) are equivalent to
k∈N
ϕ(k, · · · ) and
k∈N
ϕ(k, · · · ),
respectively. Thus, in the formulae below, we shall be using such quantiﬁers,
assuming that they are eliminated in the last step of the translation from
L∗
∞ω(Cnt) to L◦
∞ω(Cnt).
To eliminate counting terms, assume without loss of generality that every
occurrence of #x.ϕ is of the form #x.ϕ = #y.ψ or #x.ϕ = i, where i is
a variable or a constant (if #x.ϕ occurs inside an arithmetic predicate P,
we replace P by its explicit deﬁnition, using inﬁnitary connectives). Since
#x.ϕ = #y.ψ is equivalent to ∃i (#x.ϕ = i) ∧ (#y.ψ = i), whose rank is
the same as the rank of #x.ϕ = #y.ψ, and #x.ϕ = k, for a constant k, is
equivalent to ∃i (#x.ϕ = i ∧ i = k), we may assume that all occurrences of
#-terms are of the form #x.ϕ = i, where i is a second-sort variable.
The proof is now by induction on the formula. The only nontrivial case is
ψ(y, ) ≡ (#x.ϕ(x, y, ) = i). Throughout this proof, we assume that i is in .
By the hypothesis, there exists an L◦
∞ω(Cnt) formula ϕ◦
which is equivalent
to ϕ and has the same rank. We must now produce an L◦
∞ω(Cnt) formula
ψ◦
equivalent to ψ such that rk(ψ◦
) = rk(ϕ)+ | x |. The existence of such a
formula will follow from the lemma below.
Lemma 8.9. Let ϕ(x, y, ) be an L◦
∞ω(Cnt) formula. Then there exists an
L◦
∞ω(Cnt) formula γ(y, ) of rank rk(ϕ) + |x| such that γ is equivalent to
#x.ϕ = i.
Proof of the lemma is by induction on | x |. If x has a single component x,
γ(y, ) is deﬁned as
∃l (l = i) ∧ (∃!lx ϕ(x, y, )) ,
which has rank rk(ϕ) + 1. The quantiﬁer ∃l denotes an inﬁnite disjunction, as
explained earlier.
We next assume that x = zx0. By the hypothesis, there is an L◦
∞ω(Cnt)
formula α(x0, y, , l) equivalent to (l = #z.ϕ(z, x0, y, )) such that rk(α) =
rk(ϕ)+ |z|. We deﬁne
β(y, , k, l) ≡ ∃!kx0 α(x0, y, , l).
Then rk(β) = rk(α) + 1 = rk(ϕ)+ |x|. The formula β(y, , k, l) holds iﬀ there
exist exactly k elements x0 such that the number of vectors x with x0 in the
last position that satisfy ϕ(x, · · · ) is precisely l. Note that if β(y, , k, l) and
β(y, , k′
, l) hold, then k′
must equal k.
Thus, to check if #x.ϕ = i, one must check if
150 8 Logics with Counting
β(··· ,k,l) holds
(k · l) = i.
This is done as follows. Let γp(y, ) be deﬁned as:
∃i1 . . . ip∃j1 . . . jp









p
s=1
β(y, , is, js)
∧ ∀i, j β(y, , i, j) →
p
s=1
(i = is ∧ j = js)
∧
s=s′
(¬(is = is′ ) ∨ ¬(js = js′ ))
∧ i1 · j1 + . . . + ip · jp = i









That is, γp says that there are precisely p pairs (is, js) that satisfy β(y, , k, l),
and
p
s=1 is · js = i. When p = 0, we deﬁne γp(y, ) as (i = 0) ∧
∀i′
, j′
(¬β(y, , i′
, j′
)). We can see that rk(γp) = rk(β). We ﬁnally deﬁne
γ(y, ) ≡
p∈N
γp(y, ).
It follows that γ is an L◦
∞ω(Cnt) formula of rank that is equal to rk(β), and
hence to rk(ϕ)+ | x |, and that γ is equivalent to #x.ϕ = i. This completes
the proof of the lemma and the proposition.
We next consider L∗
∞ω(Cnt)+ <; that is, L∗
∞ω(Cnt) over ordered structures.
We shall see in the next section that, as for FO, there is a separation
L∗
∞ω(Cnt) (L∗
∞ω(Cnt)+ <)inv.
As the ﬁrst step, we show that L∗
∞ω(Cnt)+ < deﬁnes every property of ﬁnite
structures. Intuitively, with <, one can say that a given element of A is the
ﬁrst, second, etc., element of A. Then the unlimited counting power allows us
to code ﬁnite structures with numbers.
Proposition 8.10. Every property of ﬁnite ordered structures is deﬁnable in
L∗
∞ω(Cnt).
Proof. We show this for sentences in the language of graphs. Let C be a class
of ordered graphs. We assume without loss of generality that the set of nodes
of each such graph is a set of the form {0, . . . , n}. Then the membership in C
is tested by the following L∗
∞ω(Cnt) sentence of rank 3:
G∈C
∀x∀y E(x, y) ↔
(k,l)∈EG
k = #z.(z < x)
∧ l = #z.(z < y)
,
where EG
stands for the set of edges of G.
8.3 Games for L∗
∞ω(Cnt) 151
We ﬁnish this section by presenting a one-sorted version of L∗
∞ω(Cnt)
that has the same expressiveness. This logic is obtained by adding inﬁnitary
connectives and unary quantiﬁers to FO.
Let QAll be the collection of all unary quantiﬁers; that is, all quantiﬁers
QK where K ranges over all collections of unary structures closed under isomorphism.
We deﬁne a logic L∞ω(QAll) by extending L∞ω with the formation
rules (8.1) for each QK ∈ QAll, with the semantics given by (8.2), and quantiﬁer
rank deﬁned as in (8.3). We then deﬁne L∗
∞ω(QAll) as the restriction of
L∞ω(QAll) to formulae of ﬁnite quantiﬁer rank. This logic turns out to express
the same sentences as L∗
∞ω(Cnt). The proof of the proposition below is left
as an exercise for the reader.
Proposition 8.11. For every L∗
∞ω(Cnt) formula ϕ(x) without free secondsort
variables, there is an equivalent L∗
∞ω(QAll) formula ψ(x) such that
rk(ϕ) = qr(ψ), and conversely, for every L∗
∞ω(QAll) formula ψ(x), there is
an equivalent L∗
∞ω(Cnt) formula ϕ(x) with rk(ϕ) = qr(ψ).
8.3 Games for L∗
∞ω(Cnt)
We know that the expressive power of FO can be characterized via
Ehrenfeucht-Fra¨ıss´e games. Is there a similar game characterization for
L∗
∞ω(Cnt)? We give a positive answer to this question, by showing that bijective
games, introduced in Sect. 4.5, capture the expressiveness of L∗
∞ω(Cnt).
We ﬁrst review the deﬁnition of the game.
Deﬁnition 8.12 (Bijective games). A bijective Ehrenfeucht-Fra¨ıss´e game
is played by two players, the spoiler and the duplicator, on two structures
A, B ∈ STRUCT[σ]. If | A |=| B |, the spoiler wins the game. If | A |=| B |,
in each round i = 1, . . . , n, the duplicator selects a bijection fi : A → B, and
the spoiler selects a point ai ∈ A. The duplicator responds by bi = f(ai) ∈ B.
The duplicator wins the n-round game if the relation {(ai, bi) | 1 ≤ i ≤ n}
is a partial isomorphism between A and B. If the duplicator has a winning
strategy in the n-round bijective game on A and B, we write A ≡bij
n B.
Note that it is harder for the duplicator to win the bijective game. First, if
|A|=|B |, the duplicator immediately loses the game. Even if |A|=|B |, in each
round the duplicator must ﬁgure out what his response to each possible move
by the spoiler is, before the move is made, and there must be a one-to-one
correspondence between the spoiler’s moves and the duplicator’s responses.
In particular, any strategy where the same element b ∈ B could be used as a
response to several moves by the spoiler is disallowed.
Theorem 8.13. Given two structures A, B ∈ STRUCT[σ], and k ≥ 0, the
following are equivalent:
1. A ≡bij
k B;
152 8 Logics with Counting
2. A and B agree on all L∗
∞ω(Cnt) sentences of rank k.
Proof. Both implications 1 → 2 and 2 → 1 are proved by induction on k. We
start with the easier implication 1 → 2. By Proposition 8.8, assume that there
is no quantiﬁcation over the numerical domain, and that all quantiﬁers are of
the form ∃ix. For the base case k = 0, the proof is the same as in the case of
Ehrenfeucht-Fra¨ıss´e games.
We now assume that the implication holds for k, and we prove it for k +1.
Suppose A ≡bij
k+1 B. First consider a sentence of the form Φ ≡ ∃nxϕ(x) for
a constant n ∈ N. Suppose A |= Φ, and let c1, . . . , cn be distinct elements
of A such that A |= ϕ(ci), i = 1, . . . , n. Since A ≡bij
k+1 B, there is a bijection
f : A → B such that (A, a) ≡bij
k (B, f(a)) for all a ∈ A; in particular,
(A, ci) ≡bij
k (B, f(ci)) for all i ≤ n. By the hypothesis, (A, ci) and (B, f(ci))
agree on sentences of rank k; hence A |= ϕ(ci) implies B |= ϕ(f(ci)). Since f
is a bijection, all f(ci)’s are distinct, and thus B |= ∃nxϕ(x). The converse,
that B |= Φ implies A |= Φ, is proved in exactly the same way, using the
bijection f−1
.
Since every sentence of rank k + 1 can be obtained from sentences of the
form ∃nxϕ(x) by using the Boolean and inﬁnitary connectives, we see that
A |= Φ ⇔ B |= Φ for any rank k + 1 sentence Φ.
For the other direction, we use a proof similar to the proof of the
Ehrenfeucht-Fra¨ıss´e theorem given in Exercise 3.11. We want to deﬁne explicitly
formulae specifying rank-k types in L∗
∞ω(Cnt). The number of types can
be inﬁnite, but this is not a problem since we can use inﬁnitary connectives,
and rank-k types will be given by formulae of rank k.
We let ϕ0,m
i (x) be an enumeration of all the formulae that deﬁne distinct
atomic types of x with |x|= m; that is, all consistent conjunctions of the form
α1(x) ∧ . . . ∧ αM (x),
where αi(x) enumerate all (ﬁnitely many) atomic and negated atomic formulae
in x.
Next, inductively, let {ϕk+1,m
i (x) | i ∈ N} be an enumeration of all the
formulae of the form
∃!l1 y ϕk,m+1
i1
(x, y)∧. . .∧∃!lp y ϕk,m+1
ip
(x, y) ∧ ∀y
p
j=1
ϕk,m+1
ij
(x, y) , (8.8)
as p ranges over N and (l1, . . . , lp) ranges over p-tuples of positive integers.
Intuitively, each ϕk,m+1
ij
(x, y) deﬁnes the rank-k m + 1-type of a tuple (x, y).
Hence rank-k + 1 types of the form (8.8) say that a given x can be extended
to p diﬀerent rank-k types in such a way that for each ij, there are precisely lj
elements y such that ϕk,m+1
ij
(x, y) deﬁnes the ijth rank-k of the tuple (x, y).
Note that if the formula (8.8) is true in (A, a), then |A|= l1 + . . . + lp.
8.4 Counting and Locality 153
It follows immediately from the deﬁnition of formulae ϕk,m
i that for every
A, a ∈ Am
, and every k ≥ 0, there is exactly one ϕk,m
i such that A |= ϕk,m
i (a).
Next, we prove the following lemma by induction on k.
Lemma 8.14. For every m, every two structures A, B, and every a ∈ Am
, b ∈
Bm
, suppose there is a formula ϕk,m
i (x) such that A |= ϕk,m
i (a) and B |=
ϕk,m
i (b). Then (A, a) ≡bij
k (B, b).
Proof of the lemma. The case k = 0 is the same as in the proof of the
Ehrenfeucht-Fra¨ıss´e theorem. For the induction step, assume that the statement
holds for k, and let ϕk+1,m
i (x) be given by (8.8). If A |= ϕk+1,m
i (a)
and B |= ϕk+1,m
i (b), then both A and B have exactly l1 + . . . + lp elements.
Furthermore, for each j ≤ p, let Aj = {a ∈ A | A |= ϕk,m+1
ij
(aa)} and
Bj = {b ∈ B | B |= ϕk,m+1
ij
(bb)}. Then | Aj |=| Bj |= lj, and hence there
exists a bijection f : A → B that maps each Aj to Bj. For any a ∈ A, if j
is such that A |= ϕk,m+1
ij
(aa), then B |= ϕk,m+1
ij
(bf(a)), and hence by the induction
hypothesis, (A, aa) ≡bij
k (B, bf(a)). Thus, the bijection f proves that
(A, a) ≡bij
k+1 (B, b).
The implication 2 → 1 of Theorem 8.13 is now a special case of Lemma
8.14, since rk(ϕk,m
i ) = k.
8.4 Counting and Locality
Theorem 8.13 and Corollary 4.21 stating that (A, a) ⇆(3k−1)/2 (B, b) implies
(A, a) ≡bij
k (B, b), immediately give us the following result.
Theorem 8.15. Every L∗
∞ω(Cnt) formula ϕ(x) without free second-sort variables
is Hanf-local (and hence Gaifman-local, and has the BNDP).
Thus, despite its enormous counting power, L∗
∞ω(Cnt) remains local, and
cannot express properties such as graph connectivity. Combining Theorem
8.15 and Proposition 8.6, we obtain the following.
Corollary 8.16. If ϕ(x) is an FO(Cnt) formula without free second-sort
variables, or an FO(Q) formula, where Q is an arbitrary collection of unary
quantiﬁers, then ϕ(x) is Hanf-local (and hence Gaifman-local, and has the
BNDP).
Furthermore, we obtain the separation
L∗
∞ω(Cnt) (L∗
∞ω(Cnt)+ <)inv, (8.9)
since (L∗
∞ω(Cnt)+ <) expresses every property of ordered structures (including
nonlocal ones, such as graph connectivity), by Proposition 8.10.
154 8 Logics with Counting
Theorem 8.15 says nothing about formulae that may have free numerical
variables. Next, we show how to extend the notions of Hanf- and Gaifmanlocality
to such formulae.
Deﬁnition 8.17. An L∗
∞ω(Cnt) formula ϕ(x, ı) is Hanf-local if there exists
d ≥ 0 such that for all ı0 ∈ N|ı|
, any two structures A, B, and a ∈ A|x|
,
b ∈ B|x|
,
(A, a)⇆d(B, b) implies A |= ϕ(a, ı0) ⇔ B |= ϕ(b, ı0) .
Furthermore, ϕ(x, i) is Gaifman-local if there is d ≥ 0, such that for all
ı0 ∈ N|ı|
, every structure A, and a1, a2 ∈ A|x|
,
a1 ≈A
d a2 implies A |= ϕ(a1, ı0) ↔ ϕ(a2, ı0).
The locality rank lr(·) and the Hanf-locality rank hlr(·) are deﬁned as before:
these are the smallest d that witnesses Gaifman-locality (Hanf-locality,
respectively) of a formula.
In other words, the formula must be Hanf-local or Gaifman-local for any
instantiation of its free second-sort variables, with the locality rank being
uniformly bounded for all such instantiations.
A simple extension of Theorem 4.11 shows:
Proposition 8.18. If an L∗
∞ω(Cnt) formula ϕ(x, ı) is Hanf-local, then it is
Gaifman-local.
Furthermore, we can show Hanf-locality of all L∗
∞ω(Cnt) formulae (not
just those without free numerical variables) by using essentially the same
argument as in Theorem 4.12.
Theorem 8.19. Every L∗
∞ω(Cnt) formula ϕ(x, ı) is Hanf-local, and hence
Gaifman-local. Furthermore, hlr(ϕ) ≤ (3k
− 1)/2, and lr(ϕ) ≤ (3k+1
− 1)/2,
where k = rk(ϕ).
Proof. We give the proof for Hanf-locality; it is by induction on the structure
of the formulae. For atomic formulae and Boolean connectives, it is the same
as the proof of Theorem 4.12. For inﬁnitary connectives, the argument is the
same as for ∧ and ∨. By Proposition 8.8, the only remaining case is that
of counting quantiﬁers: ϕ(x, ı) ≡ ∃jy ψ(y, x, ı). We assume j is in ı. Let
rk(ψ) = k, so that rk(ϕ) = k + 1. Let d = hlr(ψ). It suﬃces to show that
hlr(ϕ) ≤ 3d + 1.
Fix an interpretation ı0 for ı (and j0 for j). Assume (A, a)⇆3d+1(B, b). By
Corollary 4.10, there is a bijection f : A → B such that (A, ac) ⇆d (B, bf(c))
for every c ∈ A. Assume A |= ϕ(a, ı); then we can ﬁnd c1, . . . , cj0 such
that A |= ψ(cl, a, ı), l = 1, . . . , j0. Since hlr(ψ) = d, by the hypothesis,
(A, acl) ⇆d (B, bf(cl)) implies B |= ψ(f(cl), b, ı), l = 1, . . . , j0. Thus,
B |= ϕ(b, ı), since f is a bijection. The converse, that B |= ϕ(b, ı) implies
A |= ϕ(a, ı), is identical. This proves hlr(ϕ) ≤ 3d + 1.
8.5 Complexity of Counting Quantiﬁers 155
8.5 Complexity of Counting Quantiﬁers
In this section we revisit the logic FO(Cnt), and give a circuit model that
corresponds to it. This circuit model deﬁnes a complexity class that extends
AC0
; the class is called TC0
, where TC stands for threshold circuits. There
are diﬀerent ways of deﬁning the class TC0
; the one chosen here uses majority
circuits, which have special gates for the majority function.
Deﬁnition 8.20. Majority circuits are deﬁned as the usual Boolean circuits
except that they have additional majority gates. Such a gate has 2k inputs,
x1, . . . , xn, y1, . . . , yn, for k > 0. The output of the gate is 1 if
n
i=1
xi ≥
n
i=1
yi,
and 0 otherwise.
A circuit family C has one circuit Cn for each n, where n is the number
of inputs. The size, the depth, and the language accepted by C, are deﬁned in
exactly the same way as for Boolean circuits. The class nonuniform TC0
is
deﬁned as the class of languages (subsets of {0, 1}∗
) accepted by polynomialsize
constant-depth families of majority circuits.
We now extend FO(Cnt) to a logic FO(Cnt)All. This logic, in addition to
FO(Cnt), has the linear ordering < on the non-numerical universe, and, furthermore,
the restriction of every predicate P ⊆ Nk
to the numerical universe
{0, . . . , n − 1}; that is, P ∩ {0, . . . , n − 1}k
.
Theorem 8.21. The class of structures deﬁnable by an FO(Cnt)All sentence
is in nonuniform TC0
. Consequently, the data complexity of FO(Cnt)All is
nonuniform TC0
.
Proof. As in the proof of Theorem 6.4, we code formulae by circuits. We ﬁrst
note that if a linear order is available on the non-numerical universe A, there
is no need for the numerical universe {0, . . ., n − 1}, where n =| A |, since
we can interpret min, max, <, and the arithmetic operations directly on A,
associating the ith element of A in the ordering < with i ∈ N. Thus, counting
quantiﬁers will be assumed to be of the form ∃yxϕ(x, · · · ), stating that there
exist at least i elements x satisfying ϕ, where y is the ith element of A in the
ordering <.
Recall that for each structure A with |A| = n, its encoding enc(A)
starts with 0n
1 that represents the size of the universe. For each formula
ϕ(x1, . . . , xm), and each tuple b = (b1, . . . , bm) in Am
, we construct a circuit
Cn
ϕ(b)
with the input enc(A) which outputs 1 iﬀ A |= ϕ(b).
If ϕ(b) is an atomic formula of the form S(b), where S ∈ σ, then we simply
output the corresponding bit from enc(A). If ϕ is a numerical formula, we
156 8 Logics with Counting
output 1 or 0 depending on whether ϕ(b) is true. For Boolean connectives, we
simply use ∨, ∧ or ¬ gates. Thus, it remains to show how to handle the case
of counting quantiﬁers.
Let ϕ(x1, . . . , xm) ≡ ∃x1 y ψ(y, x). That is, there exist x1 elements y
satisfying ϕ (since structures are ordered, we associate an element x1 with its
ranking in the linear order).
Let b ∈ Am
be given, and let a0, . . . , an−1 enumerate all the elements of A.
Let Ci be the circuit Cn
ψ(ai,b)
. We then collect the n outputs of such circuits,
and for each of the ﬁrst n inputs (which are the ﬁrst n zeros of enc(A)), we
produce 1 for the ﬁrst a1 zeros, and 0 for the remaining n − a1 zeros. This
can easily be done with small constant-depth circuits. We then feed all the 2n
inputs to a majority gate as shown in Fig. 8.1.
C0 C1 Cn−1
MAJ
1 1 1 0 0
a1
. . .
. . . . . .
Fig. 8.1. Circuit for the proof of Theorem 8.21
It is clear from the construction that the family of circuits deﬁned this
way has a ﬁxed constant depth (in fact, linear in the size of the formula), and
polynomial size in terms of A . This completes the proof.
As with nonuniform AC0
, the nonuniform version of TC0
can deﬁne even
noncomputable problems, since every predicate on N is available. The uniform
version of TC0
is deﬁned as FO(Cnt)+ <: that is, FO(Cnt) with ordering
available on the non-numerical universe. Thus, we restrict ourselves
to addition and multiplication on natural numbers, and other functions and
predicates deﬁnable with them (e.g., the BIT predicate).
Uniform TC0
is a proper extension of uniform AC0
: for example, parity
is in TC0
but not in AC0
. It appears to be a rather modest extension: all we
add is a simple form of counting. In particular, TC0
is contained in Ptime,
and in fact even in DLog. Nevertheless, we still do not know if TC0
NP.
We know, however, that FO(Cnt) is subsumed by L∗
∞ω(Cnt), and that
L∗
∞ω(Cnt) is local – and hence it cannot express many Ptime problems such
as graph connectivity, acyclicity, etc. Would not this give us the desired separation?
Unfortunately, it would not, since we can only prove locality of FO(Cnt)
but not FO(Cnt)+ <. We have seen that for FO, its extension with order,
8.5 Complexity of Counting Quantiﬁers 157
that is, (FO+ <)inv, is local too. The same result, however, is not true for
FO(Cnt). We now show a counterexample to locality of (FO(Cnt)+<)inv.
Proposition 8.22. There exist queries expressible in (FO(Cnt)+<)inv which
are not Gaifman-local.
Proof. The vocabulary σ contains a binary relation E and a unary relation P.
We call a σ-structure good if three conditions are satisﬁed:
1. E has exactly one node of in-degree 0 and out-degree 1, exactly one node
of out-degree 0 and in-degree 1, and all other nodes have both in-degree
1 and out-degree 1.
That is, the relation E is a disjoint union of a chain
{(a0, a1), (a1, a2), . . . , (ak−1, ak)} and zero or more cycles.
2. P contains a0, does not contain ak, and with each a ∈ P, except a0, it
contains its predecessor in E (the unique node b such that (b, a) ∈ E).
Thus, P contains an initial segment of the successor part of E, and may
contain some of the cycles in E.
3. |P| ≤ log n, where n is the size of the universe of the structure.
We claim that there is an FO(Cnt) sentence Φgood that tests if a structure
A ∈ STRUCT[σ] is good. Clearly, conditions 1 and 2 can be veriﬁed by FO
sentences. For condition 3, it suﬃces to check that the predicate j ≤ log k is
deﬁnable. Since j ≤ log k iﬀ 2j
≤ k, and the predicate i = 2j
is deﬁnable even
in FO in the presence of addition and multiplication (see Sect. 6.4), we see
that all three conditions can be deﬁned in FO(Cnt).
We now consider the following binary query Q:
If A is good, return the transitive closure of E restricted to P.
The result will follow from two claims. First, Q is deﬁnable in FO(Cnt)+<.
Second, Q is not Gaifman-local. The latter is simple: assume, to the contrary,
that Q is Gaifman-local and let d = lr(Q). Let k = 4d + 5, and n = 2k
. Take
E to be a successor (chain) of length n, with P interpreted as its initial k
elements. Notice that this is a good structure. Then in P, we can ﬁnd two
elements a, b with isomorphic and disjoint d-neighborhoods. Hence, (a, b) ≈d
(b, a), but the transitive closure query would distinguish (a, b) from (b, a).
It remains to show that Q is expressible in FO(Cnt)+<. First, we assume,
without loss of generality, that in a given structure A, elements of P precede
elements of A − P in the ordering <. Indeed, if this is not true of <, we can
always deﬁne, in FO, a new ordering <1 which coincides with < on P and on
A − P, and, furthermore, a <1 b for all a ∈ P and b ∈ P.
Let S ⊆ P, with S = {s1, . . . , sm}. Let each sj be the ijth element in
the ordering <; that is, ∃!ijx (x ≤ sj) holds. Deﬁne aS as the pth element
of A in the ordering <, where BIT(p, i1), . . . , BIT(p, im) are all true, and for
158 8 Logics with Counting
every i ∈ {i1, . . . , im}, the value of BIT(p, i) is false. Since |P| ≤ log n, such
an element aS exists for every S ⊆ P. Moreover, since BIT is deﬁnable, there
is a deﬁnable (in FO(Cnt)) predicate Code(u, v) which is true iﬀ v is of the
form aS for a set S, and u ∈ S.
The query Q will now be deﬁnable by a formula ∃z ψ(x, y, z), where ψ
says that z codes the path from x to y. That is, it says the following:
• Code(x, z) and Code(y, z) hold.
• If x0 is the predecessor of x and y0 is the successor of y, then Code(x0, z)
and Code(y0, z) do not hold.
• For every other element u = x, y such that Code(u, z) holds, it is the case
that Code(u1, z) and Code(u2, z) hold, where u1 and u2 are the predecessor
and the successor of u.
• Code(a0, z) holds iﬀ a0 = x, and Code(ak, z) does not hold. Here a0 and
ak are the elements of in-degree and out-degree 0, respectively.
Clearly, all these conditions can be expressed in FO(Cnt).
Given the special form of E, one can easily verify that this deﬁnes the
transitive closure restricted to P.
As a corollary of Proposition 8.22, we get a separation
FO(Cnt) (FO(Cnt)+<)inv,
since all FO(Cnt)-expressible queries are Gaifman-local, by Corollary 8.16.
8.6 Aggregate Operators
Aggregate operators occur in most practical database query languages. They
allow one to apply functions for entire columns of relations. For example, if we
have a ternary relation R whose tuples are (d, e, s), where d is the department
name, e is the employee name, and s is his/her salary, a typical aggregate
query would ask for the total salary for each department. Such a query would
construct, for each department d, the set of all tuples {(e1, s1), . . . , (en, sn)}
such that (d, ei, si) ∈ R for i = 1, . . . , n, and then output (d,
n
i=1 si). We
view this as applying the aggregate function SUM to the multiset {s1, . . . , sn}
(it is a multiset since some of the si’s can be the same, but we have to sum
them all).
Logics with counting seen so far are not well suited for proving results
about languages with aggregations, as they cannot talk about entire columns
of relations. Nevertheless, we shall show here that aggregate operators can be
simulated in L∗
∞ω(Cnt), thereby giving us expressibility bounds for practical
database query languages.
We ﬁrst deﬁne the notion of an aggregate operator.
8.6 Aggregate Operators 159
Deﬁnition 8.23. An aggregate operator is a collection F =
{f0, f1, f2, . . . , fω} of functions, where each fn, 0 < n < ω, takes an
n-element multiset (bag) of natural numbers, and returns a number in N.
Furthermore, f0 and fω are constants; fω is the ﬁxed value associated to all
inﬁnite multisets.
For example, the aggregate SUM will be represented as FSUM =
{f0, f1, f2, . . . , fω}, where f0 = fω = 0, and
fn({a1, . . . , an}) = a1 + . . . + an.
Deﬁnition 8.24 (Aggregate logic). The aggregate logic Laggr is deﬁned
as the following extension of L∗
∞ω(Cnt).
For every possible aggregate operator F, a numerical term t(x, y) and a
formula ϕ(x, y), we have a new numerical term
t′
(x) = AggrF y t(x, y), ϕ(x, y) .
Variables y become bound in AggrF y t(x, y), ϕ(x, y) .
The value t′
(a) is calculated as follows. If there are inﬁnitely many b such
that ϕ(a, b) holds, then t′
(a) = fω. If there is no b such that ϕ(a, b) holds,
then t′
(a) = f0. Otherwise, let b1, . . . , bm enumerate all the b such that ϕ(a, b)
holds. Then
t′
(a) = fm({t(a, b1), . . . , t(a, bm)}).
Note that the argument of fm is in general a multiset, since some of t(a, bi)
may be the same. The rank of t′
is deﬁned as max(rk(t), rk(ϕ))+ |y|.
For example, the query that computes the total salary for each department
is given by the following Laggr formula ϕ(d, v):
∃e∃s R(d, e, s) ∧ v = AggrFSUM
(e, s) s, R(d, e, s) .
The above query assumes that some of the columns in a relation could
be numerical. The results below are proved without this assumption, but it
is easy to extend the proofs to relations with columns of diﬀerent types (see
Exercise 8.16).
It turns out that this seemingly powerful extension does not actually provide
any additional power.
Theorem 8.25. The expressive power of Laggr and L∗
∞ω(Cnt) is the same.
Proof. It suﬃces to show that for every formula ϕ(x) of Laggr, there exists an
equivalent formula ϕ◦
(x) of L∗
∞ω(Cnt) such that rk(ϕ◦
) ≤ rk(ϕ). We prove
this theorem by induction on the formulae and terms. We also produce, for
each second-sort term t(x) of Laggr, a formula ψt(x, z) of L∗
∞ω(Cnt), with z
of the second sort, such that A |= ψt(a, n) iﬀ the value of t(a) on A is n. Below
we show how to produce such formulae ψt.
160 8 Logics with Counting
For a second-sort term t which is a variable i, we deﬁne ψt(i, z) to be
(z = i). If t is a constant c, then ψt(z) ≡ (z = c).
For a term
t′
(x) = AggrF y t(x, y), ϕ(x, y) ,
ψt′ (x, z) is deﬁned as
ϕ◦
∞(x) ∧ (z = fω) ∨ ¬ϕ◦
∞(x) ∧ ψ′
(x, z) ,
where ϕ◦
∞(x) tests if the number of y satisfying ϕ(x, y) is inﬁnite, and ψ′
produces the value of the term in the case when the number of such y is
ﬁnite.
The formula ϕ◦
∞(x) can be deﬁned as
i:yi of 2nd sort C⊆N, C inﬁnite c∈C
ϕ◦
i (x, c)
where ϕ◦
i (x, yi) ≡ ∃y1, . . . , yi−1, yi+1, . . . , ym ϕ◦
(x, y).
The formula ψ′
(x, z) is deﬁned as the disjunction of ¬∃yϕ◦
(x, y)∧(z = f0)
and
c,(c1,n1),...,(cl,nl)








z = c
∧ ∃!n1y (ϕ◦
(x, y) ∧ ψt(x, y, c1))
∧ · · ·
∧ ∃!nly (ϕ◦
(x, y) ∧ ψt(x, y, cl))
∧ ∀y
a∈N
(ϕ◦
(x, y) ∧ ψt(x, y, a) →
l
i=1
(a = ci))








where the disjunction is taken over all tuples (c1, n1), . . . , (cl, nl), l > 0, ni > 0,
and values c ∈ N such that
F({c1, . . . , c1
n1 times
, . . . , cl, . . . , cl
nl times
}) = c.
Indeed, this formula asserts either that ϕ(x, ·) does not hold and then
z = f0, or that c1, . . . , cl are exactly the values of the term t(x, y) when
ϕ(x, y) holds, and that ni’s are the multiplicities of the ci’s.
A straightforward analysis of the produced formulae shows that rk(ψt′ ) ≤
max(rk(ϕ◦
), rk(ψt)) plus the number of ﬁrst-sort variables in y; that is,
rk(ψt′ ) ≤ rk(t′
). This completes the proof of the theorem.
Corollary 8.26. Every query expressible in Laggr is Hanf-local and Gaifman-
local.
Thus, practical database query languages with aggregate functions still
cannot express queries such as graph connectivity or transitive closure.
8.8 Exercises 161
8.7 Bibliographic Notes
Extension of FO with counting quantiﬁers was proposed by Immerman and
Lander [135]; the presentation here follows closely Etessami [68]. Generalized
quantiﬁers are used extensively in logic, see V¨a¨an¨anen [237, 238].
The inﬁnitary counting logic L∗
∞ω(Cnt) is from Libkin [166], although a
closely related logic with unary quantiﬁers was studied in Hella [121]. Proposition
8.8 is a standard technique for eliminating counting terms over tuples,
see, e.g., Kolaitis and V¨a¨an¨anen [149], and [166].
Bijective games were introduced by Hella [121], and the connection between
bijective games and L∗
∞ω(Cnt) is essentially from that paper (it used
a slightly diﬀerent logic though). Locality of L∗
∞ω(Cnt) is from [166].
Connection between FO(Cnt) and TC0
is from Barrington, Immerman,
and Straubing [16]. The name TC0
refers to threshold circuits that use threshold
gates: such a gate has a threshold i, and it outputs 1 if at least i of its inputs
are set to 1. The equivalence of threshold and majority gates is well known,
see, e.g., Vollmer [247]. Proposition 8.22 is from Hella, Libkin, and Nurmonen
[123]. Our treatment of aggregate operators follows Gr¨adel and Gurevich [98];
the deﬁnition of the aggregate logic and Theorem 8.25 are from Hella et al.
[124].
Sources for exercises:
Exercise 8.6: Libkin [166]
Exercises 8.7 and 8.8: Libkin [167]
Exercises 8.9 and 8.10: Libkin and Wong [170]
Exercise 8.11: Immerman and Lander [135]
Exercises 8.12 and 8.13: Barrington, Immerman, and Straubing [16]
Exercises 8.14 and 8.15: Nurmonen [189]
Exercise 8.16: Hella et al. [124]
8.8 Exercises
Exercise 8.1. Show that none of the following is expressible in L∗
∞ω(Cnt): transitive
closure of a graph, testing for planarity, acyclicity, 3-colorability.
Exercise 8.2. Prove Proposition 8.10 for arbitrary vocabularies.
Exercise 8.3. Prove Proposition 8.11.
Exercise 8.4. Prove Proposition 8.18.
Exercise 8.5. Prove Theorem 8.19 for Gaifman-locality.
Exercise 8.6. Extend Exercise 4.11 to counting logics. That is, deﬁne functions
Hanf rankL, Gaifman rankL : N → N, for a logic L, as follows:
Hanf rankL(n) = max{hlr(ϕ) | ϕ ∈ L, rk(ϕ) = n},
162 8 Logics with Counting
Gaifman rankL(n) = max{lr(ϕ) | ϕ ∈ L, rk(ϕ) = n}.
Assume that we deal with purely relational vocabularies. Prove that for every n > 1,
Hanf rankL(n) = 2n−1
− 1 and Gaifman rankL(n) = 2n
− 1, when L is one of the
following: FO(Cnt), FO(Q) for any Q, L∗
∞ω(Cnt).
Exercise 8.7. Extend L∗
∞ω(Cnt) by additional atomic formulae ιd(x, y) (where
|x|=|y|), such that A |= ιd(a, b) iﬀ a ≈A
d b. Let L∗,r
∞ω(Cnt) be the resulting logic,
where every occurrence of ιd satisﬁes d ≤ r. Prove that L∗,r
∞ω(Cnt) is Hanf-local.
Exercise 8.8. Extend L∗
∞ω(Cnt) by adding local second-order quantiﬁcation: that
is, second-order quantiﬁcation restricted to Nd(a), where a is the interpretation of
free ﬁrst-order variables. Such an extension, like the one of Exercise 8.7, must have
the radii of neighborhoods, over which local second-order quantiﬁcation is done,
uniformly bounded in inﬁnitary formulae.
Complete the deﬁnition of this logic, and prove that it captures precisely all the
Hanf-local queries.
Exercise 8.9. Let k
be the class of preorders in which every equivalence class
has size at most k. The equivalence associated with a preorder is
x ∼ y ⇔ (x y) ∧ (y x).
Prove that graph connectivity is not in (L∗
∞ω(Cnt)+ k
)inv.
Exercise 8.10. The goal of this exercise is to prove a statement much stronger than
that of Exercise 8.9. Given a preorder , let [x] be the equivalence class of x with
respect to ∼. Let g : N → N be a nondecreasing function which is not bounded by
a constant. Let g be the class of preorders such that on an n-element set, for
at most g(n) elements we have |[x]| = 2, and for the remaining at least n − g(n)
elements, |[x]| = 1; furthermore, if |[x]| = 2 and |[y]| = 1, then x ≺ y. In other words,
such preorders are linear orders everywhere, except at most g(n) initial elements.
Prove the following:
1. There are functions g for which (L∗
∞ω(Cnt)+ g)inv contains nonlocal queries.
2. For every g, every query in (L∗
∞ω(Cnt)+ g)inv has the BNDP.
Exercise 8.11. Deﬁne Ehrenfeucht-Fra¨ıss´e games for FO(Cnt), and prove their
correctness.
Exercise 8.12. Consider the logic FO(MAJ) deﬁned as follows. A universe of σstructure
is ordered, and is thus associated with {0, . . . , n − 1}. Furthermore, for
each k > 0, and a formula ϕ(x, z), with |x|= k, we have a new formula
ψ(z) ≡ MAJ x ϕ(x, z),
binding x, such that A |= ψ(c) iﬀ |ϕ(A, c)| ≥ 1
2
· |A|k
. Recall that ϕ(A, c) stands for
{b | A |= ϕ(b, c)}.
Prove the following:
• Over ordered structures, the logics FO(MAJ) and FO(Cnt) express all the same
queries.
8.8 Exercises 163
• In the deﬁnition of FO(MAJ), it suﬃces to consider k ≤ 2: that is, the majority
quantiﬁer MAJ (x1, x2) ϕ(x1, x2, z).
• Over ordered structures with the BIT predicate, the fragment of FO(MAJ) in
which k = 1 (i.e., only new formulae of the form MAJ x ϕ(x, z) are allowed) is
as expressive as FO(Cnt).
Exercise 8.13. Prove the converse of Theorem 8.21: that is, any class of structures
in nonuniform TC0
is deﬁnable in FO(Cnt)All.
Exercise 8.14. Consider the generalized quantiﬁer Dn deﬁned as follows. If ϕ(x, z)
is a formula, then ψ(z) ≡ Dnx ϕ(x, z) is a formula, such that A |= ϕ(a) iﬀ |ϕ(A, a)|
mod n = 0.
Next, consider strings over the alphabet {0, 1} as ﬁnite structure (see Chap. 7),
and prove that none of the following properties of strings s0 . . . sm−1 is expressible
in FO(Dn):
• Majority:
Pm−1
i=0 si ≥ m
2
;
• m mod p = 0, for every prime p that does not divide n;
•
`Pm−1
i=0 si
´
mod p = 0, again for every prime p that does not divide n.
Exercise 8.15. Consider the generalized quantiﬁer Dn from Exercise 8.14. Consider
ordered structures (in which we can associate elements with numbers), and deﬁne
an additional predicate y = nx over them. Prove that even in the presence of such
an additional predicate, FO(Dn) cannot express the predicate y = (n + 1)x.
Exercise 8.16. Aggregate operators in database query languages normally operate
on rational numbers; for example, one of the standard aggregates is AVG =
{f0, f1, f2, . . . , fω}, where f0 = fω = 0, and fn({a1, . . . , an}) = (a1 + . . . + an)/n.
Deﬁne LQ
aggr as an extension of Laggr where the numerical domain is Q, each
q ∈ Q is a numerical term, and all aggregate operators F on Q are available.
Prove the following:
1. For every LQ
aggr formula ϕ(x) without free numerical variables, there exists an
equivalent L∗
∞ω(Cnt) formula of the same rank.
2. Conclude that LQ
aggr is Hanf-local and Gaifman-local.
Next, extend all the results to the case when diﬀerent columns of σ-relations
could be of diﬀerent types: some of the universe of the ﬁrst sort, and some numerical.
Exercise 8.17.∗
Prove that transitive closure is not expressible in FO(Cnt)+<.
9
Turing Machines and Finite Models
In this chapter we introduce the technique of coding Turing machines in various
logics. It is precisely this technique that gave rise to numerous applications
of ﬁnite model theory in computational complexity. We start by proving the
earliest such result, Trakhtenbrot’s theorem, stating that ﬁnite satisﬁability
is not decidable. For the proof of Trakhtenbrot’s theorem, we code Turing
machines with no inputs. By a reﬁnement of this technique, we code nondeterministic
polynomial time Turing machines in existential second-order logic
(∃SO), proving Fagin’s theorem stating that ∃SO-deﬁnable properties of ﬁnite
structures are precisely those whose complexity is NP.
9.1 Trakhtenbrot’s Theorem and Failure of Completeness
Recall the completeness theorem for FO: a sentence Φ is valid (is true in all
models) iﬀ it is provable in some formal system. In particular, this implies
that the set of all valid FO sentences is r.e. (recursively enumerable), since
one can enumerate all the formal proofs of valid FO sentences. We now show
that completeness fails over ﬁnite models.
What does it mean that Φ is valid? It means that all structures A, ﬁnite
or inﬁnite, are models of Φ: that is, A |= Φ. Since we are interested in ﬁnite
models only, we want to reﬁne the notions of satisﬁability and validity in the
ﬁnite context.
Deﬁnition 9.1. Given a vocabulary σ, a sentence Φ in that vocabulary is
called ﬁnitely satisﬁable if there is a ﬁnite structure A ∈ STRUCT[σ] such
that A |= Φ.
The sentence Φ is called ﬁnitely valid if A |= Φ holds for all ﬁnite structures
A ∈ STRUCT[σ].
Theorem 9.2 (Trakhtenbrot). For every relational vocabulary σ with at
least one binary relation symbol, it is undecidable whether a sentence Φ of
vocabulary σ is ﬁnitely satisﬁable.
166 9 Turing Machines and Finite Models
In the proof that we give, the vocabulary σ contains several binary relation
symbols and a constant symbol. But it is easy to modify it to prove the result
with just one binary relation symbol (this is done by coding several relations
into one; see Exercise 9.1).
Before we prove Trakhtenbrot’s theorem, we point out two corollaries.
First, as we mentioned earlier, completeness fails in the ﬁnite.
Corollary 9.3. For any vocabulary containing at least one binary relation
symbol, the set of ﬁnitely valid sentences is not recursively enumerable.
Proof. Notice that the set of ﬁnitely satisﬁable sentences is recursively enumerable:
one simply enumerates all pairs (A, Φ), where A is ﬁnite, and outputs
Φ whenever A |= Φ. Assume that the set of ﬁnitely valid sentences is r.e. Since
¬Φ is ﬁnitely valid iﬀ Φ is not ﬁnitely satisﬁable, we conclude that the set
of sentences which are not ﬁnitely satisﬁable is r.e., too. However, if both a
set X and its complement ¯X are r.e., then X is recursive; hence, we conclude
that the set of ﬁnitely satisﬁable sentences is recursive, which contradicts
Trakhtenbrot’s theorem.
Another corollary states that one cannot have an analog of the L¨owenheimSkolem
theorem for ﬁnite models.
Corollary 9.4. There is no recursive function f such that if Φ has a ﬁnite
model, then it has a model of size at most f(Φ).
Indeed, with such a recursive function one would be able to decide ﬁnite
satisﬁability.
We now prove Trakhtenbrot’s theorem. The idea of the proof is to code
Turing machines in FO: for every Turing machine M, we construct a sentence
ΦM of vocabulary σ such that ΦM is ﬁnitely satisﬁable iﬀ M halts on the
empty input. The latter is well known to be undecidable (this is an easy
exercise in computability theory).
Let M = (Q, Σ, ∆, δ, q0, Qa, Qr) be a deterministic Turing machine with
a one-way inﬁnite tape. Here Q is the set of states, Σ is the input alphabet,
∆ is the tape alphabet, q0 is the initial state, Qa (Qr) is the set of accepting
(rejecting) states, from which there are no transitions, and δ is the transition
function. Since we are coding the problem of halting on the empty input, we
can assume without loss of generality that ∆ = {0, 1}, with 0 playing the role
of the blank symbol.
We deﬁne σ so that its structures represent computations of M. More
precisely,
σ = {<, min, T0(·, ·), T1(·, ·), (Hq(·, ·))q∈Q},
where
9.1 Trakhtenbrot’s Theorem and Failure of Completeness 167
• < is a linear order and min is a constant symbol for the minimal element
with respect to <; hence the ﬁnite universe will be associated with an
initial segment of natural numbers.
• T0 and T1 are tape predicates; Ti(p, t) indicates that position p at time t
contains i, for i = 0, 1.
• Hq’s are head predicates; Hq(p, t) indicates that at time t, the machine is
in state q, and its head is in position p.
The sentence ΦM states that <, min, Ti’s, and Hq’s are interpreted as indicated
above, and that the machine eventually halts. Note that if the machine
halts, then Hq(p, t) holds for some p, t, and q ∈ Qa ∪ Qr, and after that the
conﬁguration of the machine does not change. That is, all the conﬁgurations
of the halting computation can be represented by a ﬁnite σ-structure.
We deﬁne ΦM to be the conjunction of the following sentences:
• A sentence stating that < is a linear order and min is its minimal element.
• A sentence deﬁning the initial conﬁguration of M (it is in state q0, the
head is in the ﬁrst position, and the tape contains only zeros):
Hq0 (min, min) ∧ ∀p T0(p, min).
• A sentence stating that in every conﬁguration of M, each cell of the tape
contains exactly one element of ∆:
∀p∀t T0(p, t) ↔ ¬T1(p, t) .
• A sentence imposing the basic consistency conditions on the predicates
Hq’s (at any time the machine is in exactly one state):
∀t∃!p
q∈Q
Hq(p, t) ∧ ¬∃p∃t
q,q′∈Q, q=q′
Hq(p, t) ∧ Hq′ (p, t) .
• A set of sentences stating that Ti’s and Hq’s respect the transitions of
M (with one sentence per transition). For example, assume that δ(q, 0) =
(q′
, 1, ℓ); that is, if M is in state q reading 0, then it writes 1, moves the
head one position to the left and changes the state to q′
. This transition
is represented by the conjunction of
∀p∀t


p = min
∧ T0(p, t)
∧ Hq(p, t)

 →





T1(p, t + 1)
∧ Hq′ (p − 1, t + 1)
∧ ∀p′
p = p′
→
(
i=0,1
Ti(p′
, t + 1) ↔ Ti(p′
, t))





and
168 9 Turing Machines and Finite Models
∀p∀t


p = min
∧ T0(p, t)
∧ Hq(p, t)

 →





T1(p, t + 1)
∧ Hq′ (p, t + 1)
∧ ∀p′
p = p′
→
(
i=0,1
Ti(p′
, t + 1) ↔ Ti(p′
, t))





.
We use abbreviations p − 1 and t + 1 for the predecessor of p and the
successor of t in the ordering <; these are, of course, FO-deﬁnable. The
ﬁrst sentence above ensures that the tape content in position p changes
from 0 to 1, the state changes from q to q′
, the rest of the tape remains
the same, and the head moves to position p−1, assuming p is not the ﬁrst
position on the tape. The second sentence is very similar, and handles the
case when p is the initial position: then the head does not move and stays
in p.
• Finally, a sentence stating that at some point, M is in a halting state:
∃p∃t
q∈Qa∪Qr
Hq(p, t).
If ΦM has a ﬁnite model, then such a model represents a computation
of M that starts with the tape containing all zeros, and ends in a halting
state. If, on the other hand, M halts on the empty input, then the set of all
conﬁgurations of the halting computations of M coded as relations <, Ti’s,
and Hq’s, is a model of ΦM (necessarily ﬁnite). Thus, M halts on the empty
input iﬀ ΦM has a ﬁnite model. Since testing if M halts on the empty model
is undecidable, then so is ﬁnite satisﬁability for ΦM .
9.2 Fagin’s Theorem and NP
Fagin’s theorem provides a purely logical characterization of the complexity
class NP, by means of coding computations of nondeterministic polynomial
time Turing machines in a fragment of second-order logic. Before stating the
result, we give the following general deﬁnition. Recall that by properties, we
mean Boolean queries, namely, collections of structures closed under isomor-
phism.
Deﬁnition 9.5. Let K be a complexity class, L a logic, and C a class of ﬁnite
structures. We say that L captures K on C if the following hold:
1. The data complexity of L on C is K; that is, for every L-sentence Φ, testing
if A |= Φ is in K, provided A ∈ C.
2. For every property P of structures from C that can be tested with complexity
K, there is a sentence ΦP of L such that A |= ΦP iﬀ A has the
property P, for every A ∈ C.
9.2 Fagin’s Theorem and NP 169
If C is the class of all ﬁnite structures, we say that L captures K.
Theorem 9.6 (Fagin). ∃SO captures NP.
Before proving this theorem, we make several comments and point out
some corollaries. Fagin’s theorem is a very signiﬁcant result as it was the
ﬁrst machine-independent characterization of a complexity class. Normally, we
deﬁne complexity classes in terms of resources (time, space) that computations
can use; here we use a purely logical formalism. Following Fagin’s theorem,
logical characterizations have been proven for many complexity classes (we
already saw them for uniform AC0
and TC0
, and later we shall see how to
characterize NLog, Ptime, and Pspace over ordered structures).
The hardest open problems in complexity theory concern separation of
complexity classes, with the “Ptime vs. NP” question being undoubtedly the
most famous such problem. Logical characterizations of complexity classes
show that such separation results can be formulated as inexpressibility results
in logic. Suppose that we have two complexity classes K1 and K2, captured
by logics L1 and L2. To prove that K1 = K2, it would then suﬃce to separate
the logics L1 and L2; that is, to show that some problem deﬁnable in L2 is
inexpressible in L1, or vice versa.
Since the class coNP consists of the problems whose complements are in
NP, and the negation of an ∃SO sentence is an ∀SO sentence, we obtain:
Corollary 9.7. ∀SO captures coNP.
Hence, to show that NP = coNP, it would suﬃce to exhibit a property
deﬁnable in ∀SO but not deﬁnable in ∃SO. While we still do not know if
such a property exists, recall that we have a property deﬁnable in ∀MSO
but not deﬁnable in ∃MSO: graph connectivity. In fact, for reasons obvious
from Fagin’s theorem and Corollary 9.7, ∃MSO is sometimes referred to as
“monadic NP”, and ∀MSO as “monadic coNP”. Hence, Proposition 7.14 tells
us that
monadic NP = monadic coNP.
Note that separating ∀SO from ∃SO would also resolve the “Ptime vs. NP”
question:
∀SO = ∃SO ⇒ NP = coNP ⇒ Ptime = NP
(if Ptime and NP were the same, NP would be closed under the complement,
and hence NP and coNP would be the same).
As another remark, we point out that the above remark concerning the
separation of ∃SO and ∀SO is speciﬁc to the ﬁnite case. Indeed, by Fagin’s
theorem, ∃SO = ∀SO over ﬁnite structures iﬀ NP = coNP, but over some
inﬁnite structures (e.g., N, +, · ), the logics ∃SO and ∀SO are known to be
diﬀerent.
170 9 Turing Machines and Finite Models
We now prove Fagin’s theorem. First, we show that every ∃SO sentence Φ
can be evaluated in NP. Suppose Φ is ∃S1 . . . ∃Sn ϕ, where ϕ is FO. Given
A, the nondeterministic machine ﬁrst guesses S1, . . . , Sn, and then checks if
ϕ(S1, . . . , Sn) holds. The latter can be done in polynomial time in A plus
the size of S1, . . . , Sn, and thus in polynomial time in A (see Proposition
6.6). Hence, Φ can be evaluated in NP.
Next, we show that every NP property of ﬁnite structures can be expressed
in ∃SO. The proof of this direction is very close to the proof of Trakhtenbrot’s
theorem, but there are two additional elements we have to take care of: time
bounds, and the input.
Suppose we are given a property P of σ-structures that can be tested,
on encodings of σ-structures, by a nondeterministic polynomial time Turing
machine M = (Q, Σ, ∆, δ, q0, Qa, Qr) with a one-way inﬁnite tape. Here Q =
{q0, . . . , qm−1} is the set of states, and we assume without loss of generality
that Σ = {0, 1} and ∆ extends Σ with the blank symbol “ ”. We assume that
M runs in time nk
. Notice that n is the size of the encoding, so we always
assume n > 1. We can also assume without loss of generality that M always
visits the entire input; that is, that nk
always exceeds the size of the encodings
of n-element structures (this is possible because the size of enc(A), deﬁned in
Chap. 6, is polynomial in A ).
The sentence describing acceptance by M on encodings of structures from
STRUCT[σ] will be of the form
∃L ∃T0∃T1∃T2 ∃Hq0 . . . ∃Hqm−1 Ψ, (9.1)
where Ψ is a sentence of vocabulary σ ∪ {L, T0, T1, T2} ∪ {Hq | q ∈ Q}. Here
L is binary, and other symbols are of arity 2k. The intended interpretation of
these relational symbols is as follows:
• L is a linear order on the universe.
With L, one can deﬁne, in FO, the lexicographic linear order ≤k on ktuples.
Since M runs in time nk
and visits at most nk
cells, we can model
both positions on the tape (p) and time (t) by k-tuples of the elements of the
universe.
With this, the predicates Ti’s and Hq’s are deﬁned similarly to the proof
of Trakhtenbrot’s theorem:
• T0, T1, and T2 are tape predicates; Ti(p, t) indicates that position p at
time t contains i, for i = 0, 1, and T2(p, t) says that p at time t contains
the blank symbol.
• Hq’s are head predicates; Hq(p, t) indicates that at time t, the machine is
in state q, and its head is in position p.
9.2 Fagin’s Theorem and NP 171
The sentence Ψ must now assert that when M starts on the encoding of A,
the predicates Ti’s and Hq’s correspond to its computation, and eventually M
reaches an accepting state. Note that the encoding of A depends on a linear
ordering of the universe of A. We may assume, without loss of generality,
that this ordering is L. Indeed, since queries are closed under isomorphism,
choosing one particular ordering to be used in the representation of enc(A)
does not aﬀect the result.
We now deﬁne Ψ as the conjunction of the following sentences:
• The sentence stating that L deﬁnes a linear ordering.
• The sentence stating that
– in every conﬁguration of M, each cell of the tape contains exactly one
element of ∆;
– at any time the machine is in exactly one state;
– eventually, M enters a state from Qa.
All these are expressed in exactly the same way as in the proof of Trakhtenbrot’s
theorem.
• Sentences stating that Ti’s and Hq’s respect the transitions of M. These
are written almost as in the proof of Trakhtenbrot’s theorem, but one has
to take into account nondeterminism. For every a ∈ ∆ and q ∈ Q, we have
a sentence
(q′,b,move)∈δ(q,a)
α(q,a,q′,b,move),
where move ∈ {ℓ, r} and α(q,a,q′,b,move) is the sentence describing the
transition in which, upon reading a in state q, the machine writes b, makes
the move move, and enters state q′
. Such a sentence is written in exactly
the same way as in the proof of Trakhtenbrot’s theorem.
• The sentence deﬁning the initial conﬁguration of M. Suppose we have
formulae ι(p) and ξ(p) of vocabulary σ∪{L} such that A |= ι(p) iﬀ the pth
position of enc(A) is 1 (in the standard encoding of structures presented
in Chap. 6), and A |= ξ(p) iﬀ p exceeds the length of enc(A). Note that
we need L in these formulae since the encoding refers to a linear order on
the universe. With such formulae, we deﬁne the initial conﬁguration by
∀p ∀t ¬∃u (u <k t) →
ι(p) ↔ T1(t, p)
∧ ξ(p) ↔ T2(t, p)
.
In other words, at time 0, the tape contains the encoding of the structure
followed by blanks.
Just as in the proof of Trakhtenbrot’s theorem, we conclude that (9.1)
holds in A iﬀ M accepts enc(A). It thus remains to show how to deﬁne the
formulae ι(p) and ξ(p).
172 9 Turing Machines and Finite Models
We illustrate this for the case of σ = {E}, with E binary (to keep the notation
simple; extension to arbitrary vocabularies is straightforward). Assume
that the universe of the graph is {0, . . . , n − 1}, where (i, j) ∈ L iﬀ i < j. The
graph is then encoded by the string 0n
1 · s, where s is a string of length n2
,
such that it has 1 in position u · n + v, for 0 ≤ u, v ≤ n − 1, iﬀ (u, v) ∈ E.
That is, the actual encoding of E starts in position (n + 1). Already from
here, one can see that in the presence of addition and multiplication (given
as ternary relations), ι is deﬁnable. Indeed, p = (p1, . . . , pk) represents the
position p1 · nk−1
+ p2 · nk−2
+ . . . + pk−1 · n + pk. Hence, ι(p) is equivalent to
the disjunction of k
i=1 pi · nk−i
= n and
∃u ≤ (n−1)∃v ≤ (n−1) (n + 1) + u · n + v =
k
i=1
pi · nk−i
∧ E(u, v) .
With addition and multiplication, this is a deﬁnable property, and addition
and multiplication themselves can be introduced by means of additional existential
second-order quantiﬁers (since one can state in FO that a given relation
properly represents addition or multiplication with respect to the ordering L).
While this is certainly enough to conclude that ι is deﬁnable, we now
sketch a proof of deﬁnability of ι without any additional arithmetic. Instead,
we shall only refer to the linear ordering L, and we shall use the associated
successor relation (i.e., we shall refer to x + 1 or x − 1). Assume k = 3. That
is, a tuple p represents the position p1n2
+ p2n + p3 on the tape. The ﬁrst
position where the encoding of E starts is n + 1 (positions 0 to n represent
the size of the universe) and the last one is n2
+ n. Hence, if p1 > 1, then
ι is false. Assume p1 = 0. Then we are talking about the position p2n + p3.
Positions 0 to n−1 have zeros, so if p2 = 0, then again ι is false. If p3 = 0, then
(p2 −1)n+(p3 −1)+(n+1) = p2n+p3, and hence the position corresponds to
E(p2 − 1, p3 − 1). If p3 = 0, then this position corresponds to E(p2 − 2, n − 1).
Hence, the formula ι(p1, p2, p3) is of the form
(p1 = 0)
∧(p2 > 1)
∧
(p3 = 0) ∧ E(p2 − 1, p3 − 1)
∨ (p3 = 0) ∧ E(p2 − 2, n − 1)
∨ (p1 = 0) ∧ (p2 = 1) ∧ (p3 = 0) ∨ (p1 = 1) ∧ . . . ,
where for the case of p1 = 1 a similar case analysis is done. Clearly, with the
linear order L, both 0 and n − 1, and the predecessor function are deﬁnable,
and hence ι is FO. (The details of writing down ι for arbitrary k are left as
an exercise to the reader, see Exercise 9.4.) The formula ξ(p) simply says that
p, considered as a number, exceeds n2
+ n + 1. This completes the proof of
Fagin’s theorem.
We now show several more corollaries of Fagin’s theorem. The ﬁrst one is
Cook’s theorem stating that SAT, propositional satisﬁability, is NP-complete.
9.2 Fagin’s Theorem and NP 173
Corollary 9.8 (Cook). SAT is NP-complete.
Proof. Let P be a problem (a class of σ-structures) in NP. By Fagin’s theorem,
there is an ∃SO sentence Φ ≡ ∃S1 . . . ∃Sn ϕ such that A is in P iﬀ A |= Φ.
Let X = {Si(a) | i = 1, . . . , n, a ∈ Aarity(Si)
}. We construct a propositional
formula αA
ϕ with variables from X such that A |= Φ iﬀ αA
ϕ is satisﬁable.
The formula αA
ϕ is obtained from ϕ by the following three transformations:
• replacing each ∃x ψ(x, ·) by a∈A ψ(a, ·);
• replacing each ∀x ψ(x, ·) by a∈A ψ(a, ·); and
• replacing each R(a), for R ∈ σ, by its truth value in A.
In the resulting formula, the variables are of the form Si(a); that is, they come
from the set X. Clearly, A |= Φ iﬀ αA
ϕ is satisﬁable, and αA
ϕ can be constructed
by a deterministic logarithmic space machine. This proves NP-completeness
of SAT.
The logics ∃SO and ∀SO characterize NP and coNP, the ﬁrst level of the
polynomial hierarchy PH. Recall that the levels of PH are deﬁned inductively:
Σp
1 = NP, and Σp
k+1 = NPΣp
k . The level Πp
k is deﬁned as the set of complements
of problems from Σp
k. Also recall that Σ1
k is the class of SO sentences
of the form
(∃ . . . ∃)(∀ . . . ∀)(∃ . . . ∃) . . . ϕ,
with k quantiﬁer blocks, and Π1
k is deﬁned likewise but the ﬁrst block of
quantiﬁers is universal.
We now sketch an inductive argument showing that Σ1
k captures Σp
k, for
every k. The base case is Fagin’s theorem. Now consider a problem in Σp
k+1.
By Fagin’s theorem, there is an ∃SO sentence Φ (corresponding to the NP machine)
with additional predicates expressing Σp
k properties. We know, by the
hypothesis, that those properties are deﬁnable by Σ1
k formulae. Then pushing
the second-order quantiﬁer outwards, we convert Φ into a Σ1
k+1 sentence. The
extra quantiﬁer alternation arises when these predicates for Σp
k properties are
negated: suppose we have a formula ∃ . . . ∃ϕ(P), where P is expressed by a
formula ∃ . . . ∃ψ, with ψ being FO, and P may occur negatively. Then putting
the resulting formula in the prenex form, we have a second-order quantiﬁer
preﬁx of the form (∃ . . . ∃)(∀ . . . ∀). For example, ∃ . . . ∃ ¬ ∃ . . . ∃ψ is equivalent
to ∃ . . . ∃∀ . . . ∀ ¬ψ. Filling all the details of this inductive proof is left to
the reader as an exercise (Exercise 9.5).
Thus, we have the the following result.
Corollary 9.9. For each k ≥ 1,
• Σ1
k captures Σp
k, and
• Π1
k captures Πp
k .
In particular, SO captures the polynomial hierarchy.
174 9 Turing Machines and Finite Models
9.3 Bibliographic Notes
Trakhtenbrot’s theorem, one of the earliest results in ﬁnite model theory, was
published in 1950 [234].
Fagin’s theorem was published in 1974 [70, 71]. His motivation came from
the complementation problem for spectra. The spectrum of a sentence Φ is the
set {n ∈ N | Φ has a ﬁnite model of size n}. The complementation problem
(Asser [14]) asks whether spectra are closed under complement; that is, where
the complement of the spectrum of Φ is the spectrum of some sentence Ψ.
If σ = {R1, . . . , Rn} is the vocabulary of Φ, then the spectrum of Φ can
be alternatively viewed as ﬁnite models (of the empty vocabulary) of the
∃SO sentence ∃R1 . . . ∃Rn Φ (by associating a universe of size n with n).
Fagin deﬁned generalized spectra as ﬁnite models of ∃SO sentences (i.e., the
vocabulary no longer needs to be empty). The complementation problem for
generalized spectra is then the problem whether NP equals coNP.
The result that ∃SO and ∀SO are diﬀerent on N, +, · is due to Kleene
[146]. In fact, over N, +, · , the intersection of ∃SO and ∀SO collapses to FO,
while over ﬁnite structures it properly contains FO.
Cook’s theorem is from [39] (and is presented in many texts of complexity
and computability, e.g. [126, 195]).
The polynomial hierarchy and its connection with SO are from Stockmeyer
[223].
Sources for exercises:
Exercises 9.6 and 9.7: Gr¨adel [97]
Exercise 9.8: Jones and Selman [140]
Exercise 9.9: Lautemann, Schwentick, and Th´erien [162]
Exercise 9.10: Eiter, Gottlob, and Gurevich [63]
Exercise 9.11: Gottlob, Kolaitis, and Schwentick [95]
Exercise 9.12: Makowsky and Pnueli [178]
Exercise 9.13: (a) from Fagin [72]
(b) from Ajtai [10]
(see also Fagin [74])
9.4 Exercises
Exercise 9.1. Prove Trakhtenbrot’s theorem for an arbitrary vocabulary with at
least one binary relation symbol.
Hint: use the binary relation symbol to code several binary relations, used in our
proof of Trakhtenbrot’s theorem.
Exercise 9.2. Prove that Trakhtenbrot’s theorem fails for unary vocabularies: that
is, if all the symbols in σ are unary, then ﬁnite satisﬁability is decidable.
Exercise 9.3. Use Trakhtenbrot’s theorem to prove that order invariance for FO
queries is undecidable.
9.4 Exercises 175
Exercise 9.4. Give a general deﬁnition of the formula ι from the proof of Fagin’s
theorem (i.e., for arbitrary σ and k).
Exercise 9.5. Complete the proof of Corollary 9.9.
Exercise 9.6. Show that there is an encoding schema for ﬁnite σ-structures such
that the formulae ι from the proof of Fagin’s theorem can be assumed to be
quantiﬁer-free, if the successor relation and the minimal and maximal element with
respect to it can be used in formulae.
Exercise 9.7. Use the encoding scheme of Exercise 9.6 to prove that every NP
can be deﬁned by an ∃SO sentence whose ﬁrst-order part is universal (i.e., of the
form ∀ . . . ∀ ψ, where ψ is quantiﬁer-free), under the assumption that we consider
structures with explicitly given order and successor relations, as well as constants
for the minimal and the maximal elements.
Prove that without these assumptions, universal ﬁrst-order quantiﬁcation in ∃SO
formulae is not suﬃcient to capture all of NP. What kind of quantiﬁer preﬁxes does
one need in the general case?
Exercise 9.8. Prove that a set X ⊆ N is a spectrum iﬀ it is in NEXPTIME.
Explain why this does not contradict Fagin’s theorem.
Exercise 9.9. Consider the vocabulary σΣ = (<, (Pa)a∈Σ) used in Chap. 7 for
coding strings as ﬁnite structures. Recall that a sentence Φ over such vocabulary
deﬁnes a language (a subset of Σ∗
) given by {s ∈ Σ∗
| Ms |= Φ}.
Consider a restriction ∃SOmatch of ∃SO in which existential second-order variables
range over matchings: that is, binary relations of the form {(xi, yi) | i ≤ k}
where all xi’s and yi’s are distinct.
Prove that a language is deﬁnable in ∃SOmatch iﬀ it is context-free.
Exercise 9.10. Let S be a set of quantiﬁer preﬁxes, and let ∃SO(S) be the fragment
of ∃SO which consists of sentences of the form ∃R1 . . . ∃Rn ϕ, where ϕ is a prenex
formula whose quantiﬁer preﬁx is in S. We call ∃SO(S) regular if over strings it only
deﬁnes regular languages.
Prove the following:
• ∃SO(∀∗
∃∀∗
) is regular;
• ∃SO(∃∗
∀∀) is regular;
• if ∃SO(S) is regular, then it is contained in the union of ∃SO(∀∗
∃∀∗
) and
∃SO(∃∗
∀∀);
• if ∃SO(S) is not regular, then it deﬁnes some NP-complete language.
Exercise 9.11. We now consider ∃SO(S) and ∃MSO(S) over directed graphs. Prove
the following:
• ∃SO(∃∗
∀) only deﬁnes polynomial time properties of graphs;
• ∃SO(∀∀) and ∃MSO(∃∗
∀∀) in which at most one second-order quantiﬁer is used
only deﬁne polynomial time properties of graphs;
• each of the following deﬁnes some NP-complete problems on graphs:
– ∃SO(∃∀∀), where only one second-order quantiﬁer over binary relations is
used;
176 9 Turing Machines and Finite Models
– ∃MSO(∀∃) and ∃MSO(∀∀∀), where only one second-order quantiﬁer is used;
– ∃MSO(∀∀), where only two second-order quantiﬁers are used.
Exercise 9.12. Deﬁne SO(k, m) as the union of Σ1
k and Π1
k where all quantiﬁcation
is over relations of arity at most m. That is, SO(k, m) is the restriction of SO to at
most k − 1 alternations of quantiﬁers, and quantiﬁcation is over relations of arity
m. This is usually referred to as the alternation-arity hierarchy.
Prove that the alternation-arity hierarchy is strict: that is, there is a constant c
such that
SO(k, m) SO(k + c, m + c)
for all k, m.
Exercise 9.13. Deﬁne ∃SO(m) as the restriction of class of ∃SO to second-order
quantiﬁcation over relations of arity at most m. Prove the following:
(a) If ∃SO(m) = ∃SO(m + 1), then ∃SO(k) = ∃SO(m) for every k ≥ m.
(b) If σ contains an m-ary relation symbol P, then the class of structures in which
P has an even number of tuples is not ∃SO(m − 1)-deﬁnable.
(c) Conclude from (a) and (b) that, if σ contains an m-ary relation symbol P, then
∃SO(i) ∃SO(j) over σ-structures, for every 1 ≤ i < j ≤ m.
Exercise 9.14.∗
Now consider just the arity hierarchy for SO: that is, SO(m) is
deﬁned as
S
k∈N SO(k, m). Is the arity hierarchy strict?
Exercise 9.15.∗
We call a sentence categorical if it has at most one model of
each ﬁnite cardinality. Is it true that every spectrum is a spectrum of a categorical
sentence?
10
Fixed Point Logics and Complexity Classes
Most logics we have seen so far are not well suited for expressing many
tractable graph properties, such as graph connectivity, reachability, and so
on. The limited expressiveness of FO and counting logics is due to the fact
that they lack mechanisms for expressing ﬁxed point computations. Other
logics we have seen, such as MSO, ∃SO, and ∀SO, can express intractable
graph properties.
Consider, for example, the transitive closure query. Given a binary relation
R, we can express relations R0
, R1
, R2
, R3
, . . ., where Ri
contains pairs (a, b)
such that there is a path from a to b of length at most i. To compute the
transitive closure of R, we need the union of all those relations: that is,
R∞
=
∞
i=0
Ri
.
How could one compute such a union? Since relation R is ﬁnite, starting with
some n, the sequence Ri
, i ≥ 0, stabilizes: Rn
= Rn+1
= Rn+2
= . . .. Indeed,
in this case n can be taken to be the number of elements of relation R. Hence,
R∞
= Rn
; that is, Rn
is the limit of the sequence Ri
, i > 0. But we can also
view Rn
as a ﬁxed point of an operator that sends each Ri
to Ri+1
.
In this chapter we study logics extended with operators for computing
ﬁxed points of various operators. We start by presenting the basics of ﬁxed
point theory (in a rather simpliﬁed way, adapted for ﬁnite structures). We
then deﬁne various extensions of FO with ﬁxed point operators, study their
expressiveness, and show that on ordered structures these extensions capture
complexity classes Ptime and Pspace. Finally, we show how to extend FO
with an operator for computing just the transitive closure, and prove that this
extension captures NLog on ordered structures.
178 10 Fixed Point Logics and Complexity Classes
10.1 Fixed Points of Operators on Sets
Typically the theory of ﬁxed point operators is presented for complete lattices:
that is, partially ordered sets U, ≺ where every – ﬁnite or inﬁnite – subset
of U has a greatest lower bound and a least upper bound in the ordering ≺.
However, here we deal only with ﬁnite sets, which somewhat simpliﬁes the
presentation.
Given a set U, let ℘(U) be its powerset. An operator on U is a mapping
F : ℘(U) → ℘(U). We say that an operator F is monotone if
X ⊆ Y implies F(X) ⊆ F(Y ),
and inﬂationary if
X ⊆ F(X)
for all X ∈ ℘(U).
Deﬁnition 10.1. Given an operator F : ℘(U) → ℘(U), a set X ⊆ U is a
ﬁxed point of F if F(X) = X. A set X ⊆ A is a least ﬁxed point of F if it
is a ﬁxed point, and for every other ﬁxed point Y of F we have X ⊆ Y . The
least ﬁxed point of F will be denoted by lfp(F).
Let us now consider the following sequence:
X0
= ∅, Xi+1
= F(Xi
). (10.1)
We call F inductive if the sequence (10.1) is increasing: Xi
⊆ Xi+1
for all i.
Every monotone operator F is inductive, which is shown by a simple induction.
Of course X0
⊆ X1
since X0
= ∅. If Xi
⊆ X+1
, then, by monotonicity,
F(Xi
) ⊆ F(Xi+1
); that is, Xi+1
⊆ Xi+2
. This shows that Xi
⊆ Xi+1
for all
i ∈ N.
If F is inductive, we deﬁne
X∞
=
∞
i=0
Xi
. (10.2)
Since U is assumed to be ﬁnite, the sequence (10.1) actually stabilizes after
some ﬁnite number of steps, so there is a number n such that X∞
= Xn
.
To give an example, let R be a binary relation on a ﬁnite set A, and
let F : ℘(A2
) → ℘(A2
) be the operator deﬁned by F(X) = R ∪ (R ◦ X).
Here ◦ is the relational composition: R ◦ X = {(a, b) | (a, c) ∈ R, (c, b) ∈
X, for some c ∈ A}. Notice that this operator is monotone: if X ⊆ Y , then
R ◦ X ⊆ R ◦ Y . Let us now deﬁne the sequence Xi
, i ≥ 0, as in (10.1). First,
X0
= ∅. Since R ◦ ∅ = ∅, we have X1
= R. Then X2
= R ∪ (R ◦ R) = R ∪ R2
;
that is, the set of pairs (a, b) such that there is a path of length at most 2 from
a to b. Continuing, we see that Xi
= R ∪ . . . ∪ Ri
, the set of pairs connected
10.1 Fixed Points of Operators on Sets 179
by paths of length at most i. This sequence reaches a ﬁxed point X∞
, which
is the transitive closure of R.
We now prove that every monotone operator has a least ﬁxed point, which
is the set X∞
(10.2), deﬁned as the union of the increasing sequence (10.1).
Theorem 10.2 (Tarski-Knaster). Every monotone operator F : ℘(U) →
℘(U) has a least ﬁxed point lfp(F) which can be deﬁned as
lfp(F) = {Y | Y = F(Y )}.
Furthermore, lfp(F) = X∞
= i Xi
, for the sequence Xi
deﬁned by (10.1).
Proof. Let W = {Y | F(Y ) ⊆ Y }. Clearly, W = ∅, since U ∈ W. We ﬁrst
show that S = W is a ﬁxed point of F. Indeed, for every Y ∈ W, we have
S ⊆ Y and hence F(S) ⊆ F(Y ) ⊆ Y ; therefore, F(S) ⊆ W = S. On the
other hand, since F(S) ⊆ S, we have F(F(S)) ⊆ F(S), and thus F(S) ∈ W.
Hence, S = W ⊆ F(S), which proves S = F(S).
Let W′
= {Y | F(Y ) = Y } and S′
= W′
. Then S ∈ W′
and hence
S′
⊆ S; on the other hand, W′
⊆ W, so S = W ⊆ W′
= S′
. Hence,
S = S′
. Thus, S = {Y | Y = F(Y )} is a ﬁxed point of F. Since it is the
intersection of all the ﬁxed points of F, it is the least ﬁxed point of F. This
shows that
lfp(F) = {Y | Y = F(Y )} = {Y | F(Y ) ⊆ Y }.
To prove that lfp(F) = X∞
, note that the sequence Xi
increases, and
hence for some n ∈ N, Xn
= Xn+1
= . . . = X∞
. Thus, F(X∞
) = X∞
and
X∞
is a ﬁxed point. To show that it is the least ﬁxed point, it suﬃces to prove
that Xi
⊆ Y for every i and every Y ∈ W. We prove this by induction on i.
Clearly X0
⊆ Y for all Y ∈ W. Suppose we need to prove the statement for
Xi+1
. Let Y ∈ W. We have Xi+1
= F(Xi
). By the hypothesis, Xi
⊆ Y , and
by monotonicity, F(Xi
) ⊆ F(Y ) ⊆ Y . Hence, Xi+1
⊆ Y . This shows that all
the Xi
’s are contained in all the sets of W, and completes the proof of the
theorem.
Not all the operators of interest are monotone. We now present two diﬀerent
constructions by means of which the ﬁxed point of non-monotone operators
can be deﬁned.
Suppose F is inﬂationary: that is, Y ⊆ F(Y ) for all Y . Then F is inductive;
that is, the sequence (10.1) is increasing, and hence it reaches a ﬁxed
point X∞
. Now suppose G is an arbitrary operator. With G, we associate an
inﬂationary operator Ginﬂ deﬁned by Ginﬂ(Y ) = Y ∪G(Y ). Then X∞
for Ginﬂ
is called the inﬂationary ﬁxed point of G and is denoted by ifp(G). In other
words, ifp(G) is the union of all sets Xi
where X0
= ∅ and Xi+1
= Xi
∪G(Xi
).
Finally, we consider an arbitrary operator F : ℘(U) → ℘(U) and the
sequence (10.1). This sequence need not be inductive, so there are two possibilities.
The ﬁrst is that this sequence reaches a ﬁxed point; that is, for some
180 10 Fixed Point Logics and Complexity Classes
n ∈ N we have Xn
= Xn+1
, and thus for all m > n, Xm
= Xn
. If there is
such an n, it must be the case that n ≤ 2|U|
, since there are only 2|U|
subsets
of U. The second possibility is that no such n exists.
We now deﬁne the partial ﬁxed point of F as
pfp(F) =
Xn
if Xn
= Xn+1
∅ if Xn
= Xn+1
for all n ≤ 2|U|
.
The deﬁnition is unambiguous: since Xn
= Xn+1
implies that the sequence
(10.1) stabilizes, then Xn
= Xn+1
and Xm
= Xm+1
imply that Xn
= Xm
.
We leave the following as an easy exercise to the reader.
Proposition 10.3. If F is monotone, then lfp(F) = ifp(F) = pfp(F).
10.2 Fixed Point Logics
We now show how to add ﬁxed point operators to FO. Suppose we have a
relational vocabulary σ, and an additional relation symbol R ∈ σ of arity k.
Let ϕ(R, x1, . . . , xk) be a formula of vocabulary σ ∪ {R}. We put the symbol
R explicitly as a parameter, since this formula will give rise to an operator on
σ-structures.
For each A ∈ STRUCT[σ], the formula ϕ(R, x) gives rise to an operator
Fϕ : ℘(Ak
) → ℘(Ak
) deﬁned as follows:
Fϕ(X) = {a | A |= ϕ(X/R, a)}. (10.3)
Here the notation ϕ(X/R, a) means that R is interpreted as X in ϕ; more
precisely, if A′
is a (σ ∪{R})-structure expanding A, in which R is interpreted
as X, then A′
|= ϕ(a).
The idea of ﬁxed point logics is that we add formulae for computing ﬁxed
points of operators Fϕ. This already gives us formal deﬁnitions of logics IFP
and PFP.
Deﬁnition 10.4. The logics IFP and PFP are deﬁned as extensions of FO
with the following formation rules:
• (For IFP): if ϕ(R, x) is a formula, where R is k-ary, and t is a tuple of
terms, where |x|=|t|= k, then
[ifpR,xϕ(R, x)](t)
is a formula, whose free variables are those of t.
• (For PFP): if ϕ(R, x) is a formula, where R is k-ary, and t is a tuple of
terms, where |x|=|t|= k, then
[pfpR,xϕ(R, x)](t)
is a formula, whose free variables are those of t.
10.2 Fixed Point Logics 181
The semantics is deﬁned as follows:
• (For IFP): A |= [ifpR,xϕ(R, x)](a) iﬀ a ∈ ifp(Fϕ).
• (For PFP): A |= [pfpR,xϕ(R, x)](a) iﬀ a ∈ pfp(Fϕ).
Why could we not deﬁne an extension with the least ﬁxed point in exactly
the same way? The reason is that least ﬁxed points are guaranteed to exist
only for monotone operators. However, monotonicity is not an easy property
to deal with.
Lemma 10.5. Testing if Fϕ is monotone is undecidable for FO formulae ϕ.
Proof. Let Φ be an arbitrary sentence, and ϕ(S, x) ≡ (S(x) → Φ). Suppose Φ is
valid. Then ϕ(S, x) is always true and hence Fϕ is monotone in every structure.
Suppose now that A |= ¬Φ for some nonempty structure A. Then, over A,
ϕ(S, x) is equivalent to ¬S(x), and hence Fϕ is not monotone. Therefore, Fϕ
is monotone iﬀ Φ is true in every nonempty structure, which is undecidable,
by Trakhtenbrot’s theorem.
Thus, to ensure that least ﬁxed points are only taken for monotone operators,
we impose some syntactic restrictions. Given a formula ϕ that may
contain a relation symbol R, we say that an occurrence of R is negative if
it is under the scope of an odd number of negations, and is positive, if it is
under the scope of an even number of negations. For example, in the formula
∃x¬R(x)∨¬∀y∀z¬(R(y)∧¬R(z)), the ﬁrst occurrence of R (i.e., R(x)) is negative,
the second (R(y)) is positive (as it is under the scope of two negations),
and the last one (R(z)) is negative again. We say that a formula is positive
in R if there are no negative occurrences of R in it; in other words, either all
occurrences of R are positive, or there are none at all.
Deﬁnition 10.6. The logic LFP extends FO with the following formation
rule:
• if ϕ(R, x) is a formula positive in R, where R is k-ary and t is a tuple of
terms, where |x|=|t|= k, then
[lfpR,xϕ(R, x)](t)
is a formula, whose free variables are those of t.
The semantics is deﬁned as follows:
A |= [lfpR,xϕ(R, x)](a) iﬀ a ∈ lfp(Fϕ).
Of course, there is something to be proven here:
182 10 Fixed Point Logics and Complexity Classes
Lemma 10.7. If ϕ(R, x) is positive in R, then Fϕ is monotone.
The proof is by an easy induction on the structure of the formula (which
includes the cases of Boolean connectives, quantiﬁers, and lfp operators) and
is left as an exercise to the reader.
We now give a few examples of queries deﬁnable in ﬁxed point logics.
Transitive Closure and Acyclicity
Let E be a binary relation, and let ϕ(R, x, y) be
E(x, y) ∨ ∃z (E(x, z) ∧ R(z, y)).
Clearly, this is positive in R. Let ψ(u, v) be [lfpR,x,yϕ(R, x, y)](u, v). What
does this formula deﬁne?
To answer this, we must consider the operator Fϕ. For a set X, we have
Fϕ(X) = E ∪(E ◦X). We have seen this operator in the previous section, and
know that its least ﬁxed point is the transitive closure of E. Hence, ψ(u, v)
deﬁnes the transitive closure of E. This also implies that graph connectivity
is LFP-deﬁnable by the sentence ∀u∀v ψ(u, v).
As the next example, we again consider graphs whose edge relation is E,
and the formula α(S, x) given by
∀y E(y, x) → S(y) .
This formula is again positive in S. The operator Fα associated with this
formula takes a set X and returns the set of all nodes a such that all the
nodes b from which there is an edge to a are in X. Let us now iterate this
operator. Clearly, Fα(∅) is the set of nodes of in-degree 0. Then Fα(Fα(∅)) is
the set of nodes a such that all nodes b with edges (b, a) ∈ E have in-degree
0. Reformulating this, we can state that Fα(Fα(∅)) is the set of nodes a such
that all paths ending in a have length at most 1. Following this, at the ith
stage of the iteration we get the set of nodes a such that all the paths ending
in a have length at most i. When we reach the ﬁxed point, we have nodes
such that all the paths ending in them are ﬁnite. Hence, the formula
∀u [lfpS,xα(S, x)](u)
tests if a graph is acyclic.
Arithmetic on Successor Structures
As a third example, consider structures of vocabulary (min, succ), where succ
is interpreted as a successor relation on the universe, and min is the minimal
element with respect to succ. That is, the structures will be of the form
{0, . . . , n − 1}, 0, {(i, i + 1) | i + 1 ≤ n − 1} . We show how to deﬁne
+++ = {(i, j, k) | i + j = k} and ××× = {(i, j, k) | i · j = k}
10.2 Fixed Point Logics 183
on such structures. For +++, we use the recursive deﬁnition:
x + 0 = x
x + (y + 1) = (x + y) + 1.
Let R be ternary and β+(R, x, y, z) be
y = min ∧ z = x ∨ ∃u∃v R(x, u, v) ∧ succ(u, y) ∧ succ(v, z) .
Intuitively, it states the conditions for (x, y, z) to be in the graph of addition:
either y = 0 and x = z, or, if we already know that x + u = v, and y =
u + 1, z = v + 1, then we can infer x + y = z. This formula is positive in R,
and the least ﬁxed point computes the graph of addition:
ϕ+++(x, y, z) = [lfpR,x,y,zβ+(R, x, y, z)](x, y, z).
Using addition, we can deﬁne multiplication:
x · 0 = 0
x · (y + 1) = x · y + x.
Similarly to the case of addition, we deﬁne β×(S, x, y, z) as
y = min ∧ z = min ∨ ∃u∃v S(x, u, v) ∧ succ(u, y) ∧ ϕ+++(x, v, z) .
This formula is positive in S. Then
ϕ×××(x, y, z) = [lfpS,x,y,zβ×(S, x, y, z)](x, y, z)
deﬁnes the graph of multiplication. Since it uses ϕ+++ as a subformula, this
gives us an example of nested least ﬁxed point operators.
Combining this example with Theorem 6.12, we conclude that BIT is LFPdeﬁnable
over successor structures.
A Game on Graphs
Consider the following game played on a graph G = V, E with a distinguished
start node a. There are two players: player I and player II. At each
round i, ﬁrst player I selects a node bi and then player II selects a node ci,
such that (a, b1), as well as (bi, ci) and (ci, bi+1), are edges in E, for all i. The
player who cannot make a legal move loses the game.
Let S be unary, and deﬁne α(S, x) as
∀y E(x, y) → ∃z E(y, z) ∧ S(z) .
What is Fα(∅)? It is the set of nodes b of out-degree 0; that is, nodes in which
player II wins, since player I does not have a single move. In general, Fα(X)
is the set of nodes b such that no matter where player I moves from b, player
184 10 Fixed Point Logics and Complexity Classes
II will have a response from X. Thus, iterating Fα, we see that the ith stage
consists of nodes from which player II has a winning strategy in at most i − 1
rounds. Hence,
[lfpS,xα(S, x)](a)
holds iﬀ player II has a winning strategy from node a.
We conclude this section by a remark concerning free variables in ﬁxed
point formulae. So far, in the deﬁnition and all the examples we dealt with
iterating formulae ϕ(R, x) where x matched the arity of R. However, in
general one can imagine that ϕ has additional free variables. For example,
if we have a formula ϕ(R, x, y) positive in R, we can, for each tuple
b, deﬁne an operator Fb
ϕ(X) = {a | A |= ϕ(X/R, a, b)}, and a formula
ψ(t, y) ≡ [lfpR,xϕ(R, x, y)](t), with the semantics A |= ψ(c, b) iﬀ c ∈ lfp(Fb
ϕ).
It turns out, however, that free variables in ﬁxed point formulae can always
be avoided, at the expense of relations of higher arity. Indeed, the formula
ψ(t, y) above is equivalent to [lfpR′,x,yϕ′
(R′
, x, y)](t, y), where R′
is of arity
| x | + | y |, and ϕ′
is obtained from ϕ by changing every occurrence of a
subformula R(z) to R′
(z, y). This is left as an exercise to the reader. Thus,
we shall normally assume that no extra parameters are present in ﬁxed point
formulae.
10.3 Properties of LFP and IFP
In this section we study logics LFP and IFP. We start by introducing a very
convenient tool of simultaneous ﬁxed points, which allows one to iterate several
formulae at once. We then analyze ﬁxed point computations, and show
how to deﬁne and compare their stages (that is, sets Xi
as in (10.1)). From this
analysis we shall derive two important conclusions. One is that LFP = IFP on
ﬁnite structures. The other is a normal form for LFP, showing that nested occurrences
of ﬁxed point operators (which we saw in the multiplication example
in the previous section) can be eliminated.
Let σ be a relational vocabulary, and R1, . . . , Rn additional relation symbols,
with Ri being of arity ki. Let xi be a tuple of variables of length ki.
Consider a sequence Φ of formulae
ϕ1(R1, . . . , Rn, x1),
· · · ,
ϕn(R1, . . . , Rn, xn)
(10.4)
of vocabulary σ ∪ {R1, . . . , Rn}. Assume that all ϕi’s are positive in all Rj’s.
Then, for a σ-structure A, each ϕi deﬁnes an operator
Fi : ℘(Ak1
) × . . . × ℘(Akn
) → ℘(Aki
)
given by
10.3 Properties of LFP and IFP 185
Fi(X1, . . . , Xn) = {a ∈ Aki
| A |= ϕi(X1/R1, . . . , Xn/Rn, a)}.
We can combine these operators Fi’s into one operator
F : ℘(Ak1
) × . . . × ℘(Akn
) → ℘(Ak1
) × . . . × ℘(Akn
)
given by
F(X1, . . . , Xn) = (F1(X1, . . . , Xn), . . . , Fn(X1, . . . , Xn)).
A sequence of sets (X1, . . . , Xn) is a ﬁxed point of F if F(X1, . . . , Xn) =
(X1, . . . , Xn). Furthermore, if for every ﬁxed point (Y1, . . . , Yn) we have X1 ⊆
Y1, . . . , Xn ⊆ Yn, then we speak of the least ﬁxed point of F.
The product ℘(Ak1
)×. . .×℘(Akn
) is partially ordered component-wise by
⊆, and the operator F is component-wise monotone. Hence, it can be iterated
in the same way as usual monotone operators on ℘(U); that is,
X0
= (∅, . . . , ∅)
Xi+1
= F(Xi
)
X∞
=
∞
i=1
Xi
=
∞
i=1
Xi
1, . . . ,
∞
i=1
Xi
n .
(10.5)
Just as for the case of the usual operators on sets, one can prove that X∞
=
lfp(F). We then enrich the syntax of LFP with the rule that if Φ is a family
of formulae (10.4), and t is a tuple of terms of length ki, then
[lfpRi,Φ](t)
is a formula with the semantics A |= [lfpRi,Φ](a) iﬀ a belongs to the ith
component of X∞
. The resulting logic will be denoted by LFPsimult
.
As an example of a property expressible in LFPsimult
, consider the following
query Q on undirected graphs G = V, E : it returns the set of nodes (a, b)
such that there is a simple path of even length from a to b.
Let T be a ternary relation symbol, and R, S binary relation symbols. We
consider the following system Φ of formulae:
ϕ1(T, R, S, x, y, z) ≡
(E(x, y) ∧ ¬(x = z) ∧ ¬(y = z))
∨ ∃u E(x, u) ∧ T (u, y, z) ∧ ¬(x = z)
ϕ2(T, R, S, x, y) ≡
E(x, y)
∨ ∃u S(x, u) ∧ E(u, y) ∧ T (x, u, y)
ϕ3(T, R, S, x, y) ≡ ∃u R(x, u) ∧ R(u, y) ∧ T (x, u, y) .
Notice that these formulae are positive in R, S, T . We leave it to the reader to
verify that the simultaneous least ﬁxed point of this system Φ computes the
following relations:
186 10 Fixed Point Logics and Complexity Classes
• T ∞
(a, b, c) holds iﬀ there is a simple path from a to b that does not pass
through c;
• R∞
(a, b) holds iﬀ there is a simple path from a to b of odd length; and
• S∞
(a, b) holds iﬀ there is a simple path from a to b of even length.
Thus, [lfpS,Φ](x, y) expresses the query Q. (See Exercise 10.2.)
Simultaneous ﬁxed points are often convenient for expressing complex
properties, when several sets need to be deﬁned at once. The question is then
whether such ﬁxed points enrich the expressiveness of the logic. The answer,
as we are about to show, is negative.
Theorem 10.8. LFPsimult
= LFP.
Proof. We give the proof for the case of a system Φ consisting of two formulae,
ϕ1(R, S, x) and ϕ2(R, S, y). Extension to an arbitrary system is rather
straightforward, and left as an exercise for the reader (Exercise 10.3). The idea
is that we combine a simultaneous ﬁxed point into two ﬁxed point formulae,
in which the lfp operators are nested.
We need an auxiliary result ﬁrst. Assume we have two monotone operators
F1 : ℘(U) × ℘(V ) → ℘(U) and F2 : ℘(U) × ℘(V ) → ℘(V ).
Following (10.5), we deﬁne the stages of the operator (F1, F2) as X0
=
(X0
1 , X0
2 ) = (∅, ∅), Xi+1
= (Xi+1
1 , Xi+1
2 ) = (F1(Xi
), F2(Xi
)), with the ﬁxed
point (X∞
1 , X∞
2 ).
Fix a set Y ⊆ U, and deﬁne two operators:
FY
2 : ℘(V ) → ℘(V ), FY
2 (Z) = F2(Y, Z);
G1 : ℘(U) → ℘(U), G1(Y ) = F1(Y, lfp(FY
2 )).
Clearly, FY
2 is monotone, and hence lfp(FY
2 ) is well-deﬁned. The operator G1
is monotone as well (since for Y ⊆ Y ′
, it is the case that lfp(FY
2 ) ⊆ lfp(FY ′
2 )),
and hence it has a least ﬁxed point.
To prove the theorem, we need the following lemma, which is sometimes
referred to as the Bekic principle.
Lemma 10.9. X∞
1 = lfp(G1).
Before we prove the lemma, we show that the theorem follows from it.
Since X∞
1 = lfp(G1), we have to express G1 in lfp, which can be done, as G1
is deﬁned as the least ﬁxed point of a certain operator. In fact, it follows from
the deﬁnition of G1 that [lfpR,Φ](t) is equivalent to
lfpR,x ϕ1 R, [lfpS,yϕ2(R, S, y)] / S, x (t).
10.3 Properties of LFP and IFP 187
The roles of F1 and F2 can be reversed; that is, we can deﬁne FY
1 (Z) =
F1(Z, Y ) : ℘(U) → ℘(U) and G2 : ℘(V ) → ℘(V ) by G2(Y ) = F2(lfp(FY
1 ), Y ),
and prove, as in Lemma 10.9, that X∞
2 = lfp(G2). Therefore,
lfpS,y ϕ2 [lfpR,xϕ1(R, S, x)] / R, S, y (t)
is equivalent to [lfpS,Φ](t).
It remains to prove Lemma 10.9. First, notice that lfp(F
X∞
1
2 ) ⊆ X∞
2 ,
because F
X∞
1
2 (X∞
2 ) = F2(X∞
1 , X∞
2 ) = X∞
2 . That is, X∞
2 is a ﬁxed point
of F
X∞
1
2 , and thus it must contain its least ﬁxed point. Hence, G1(X∞
1 ) =
F1(X∞
1 , lfp(F
X∞
1
2 )) ⊆ F1(X∞
1 , X∞
2 ) = X∞
1 . Since lfp(G1) is the intersection
of all the set S such that G1(S) ⊆ S, we conclude that lfp(G1) ⊆ X∞
1 .
Next, we prove the reverse inclusion X∞
1 ⊆ lfp(G1). We use Z to denote
lfp(G1). We show inductively that for each i, Xi
1 ⊆ Z and Xi
2 ⊆ lfp(FZ
2 ).
This is clear for i = 0. To go from i to i + 1, calculate
Xi+1
1 = F1(Xi
1, Xi
2) ⊆ F1(Z, lfp(FZ
2 )) = G1(lfp(G1)) = lfp(G1) = Z,
and
Xi+1
2 = F2(Xi
1, Xi
2) ⊆ F2(Z, lfp(FZ
2 )) = FZ
2 (lfp(FZ
2 )) = lfp(FZ
2 ).
Thus,
X∞
1 =
∞
i=0
Xi
1 ⊆ lfp(G1).
This completes the proof of Lemma 10.9 and Theorem 10.8.
One can similarly deﬁne logics IFPsimult
and PFPsimult
, by allowing simultaneous
inﬂationary and partial ﬁxed points. It turns out that for IFP and
PFP, simultaneous ﬁxed points do not increase expressiveness either. The
proof presented for LFP would not work, as it relies on the monotonicity
of operators deﬁned by formulae, which cannot be guaranteed for arbitrary
formulae used in the deﬁnition of the logics IFP and PFP. Nevertheless, a
diﬀerent technique works for these logics. We explain it now by means of an
example; details are left as an exercise for the reader.
Assume that the vocabulary σ has two constant symbols c1 and c2 interpreted
as two distinct elements of σ-structure. This assumption is easy to get
rid of, by existentially quantifying over two variables, u and w, and stating
that u = w; however, formulae with constants will be easier to deal with.
Furthermore, we can assume without loss of generality that structures have
at least two elements, since the case of one-element structures can be dealt
with explicitly by specifying the value of a ﬁxed point operator on them.
Suppose we have two formulae, ϕ1(R1, R2, x) and ϕ2(R1, R2, x), where the
arities of R1 and R2 are n, and the length of x is n. Let S be a relation symbol
of arity n + 1, and let ψ(S, u, x) be the formula
188 10 Fixed Point Logics and Complexity Classes
(u = c1) ∧ ϕ1 S(c1, z)/R1(z), S(c2, z)/R2(z), x
∨ (u = c2) ∧ ϕ2 S(c1, z)/R1(z), S(c2, z)/R2(z), x ,
where S(ci, z)/Ri(z) indicates that every occurrence of Ri(z) is replaced by
S(ci, z). Then the ﬁxed point – inﬂationary or partial – of this formula ψ
computes the simultaneous ﬁxed point of the system {ϕ1, ϕ2}: the ﬁxed point
corresponding to Ri is the set of all n-tuples of the ﬁxed point of ψ where the
ﬁrst coordinate is ci.
This argument is generalized to arbitrary systems of formulae, thereby
giving us the following result.
Theorem 10.10. IFPsimult
= IFP and PFPsimult
= PFP.
We now come back to single ﬁxed point deﬁnitions and analyze them in
detail. Suppose we have a formula ϕ(R, x). Assume for now that ϕ is positive
in R. To construct the least ﬁxed point of ϕ on a structure A, we inductively
calculate X0
= ∅, Xi+1
= Fϕ(Xi
), and then the ﬁxed point is X∞
= i Xi
.
We shall refer to Xi
’s as stages of the ﬁxed point computation, with Xi
being
the ith stage.
First, we note that each stage is deﬁnable by an LFP formula, if ϕ is
positive in R. Indeed, for each stage i, we have a formula ϕi
(xi), such that
ϕi
(A) is exactly Xi
. These are deﬁned inductively as follows:
ϕ0
(x0) ≡ ¬(x = x) x is a variable in x0
ϕi+1
(xi+1) ≡ ϕ(ϕi
/R, xi+1).
(10.6)
Here the notation ϕ(ϕi
/R, xi+1) means that every occurrence R(y) in ϕ
is replaced by ϕi
(y) and, furthermore, all the bound variables in ϕ have
been replaced by fresh ones. For example, consider the formula ϕ(R, x, y) ≡
E(x, y) ∨ ∃z E(x, z) ∧ R(z, y) . Following (10.6), we obtain the formulae
ϕ0
(x0, y0) ≡ ¬(x0 = x0)
ϕ1
(x1, y1) ≡ E(x1, y1) ∨ ∃z1 (E(x1, z1) ∧ ϕ0
(z1, y1))
↔ E(x1, y1)
ϕ1
(x2, y2) ≡ E(x2, y2) ∨ ∃z2 (E(x2, z2) ∧ ϕ1
(z2, y2))
↔ E(x2, y2) ∨ ∃z2 (E(x2, z2) ∧ E(z2, y2))
. . . . . .
computing the stages of the transitive closure operator.
For an arbitrary ϕ, we can give formulae for computing stages of the
inﬂationary ﬁxed point computation. These are given by
ϕ0
(x0) ≡ ¬(x = x)
ϕi+1
(xi+1) ≡ ϕi
(xi+1) ∨ ϕ(ϕi
/R, xi+1).
(10.7)
10.3 Properties of LFP and IFP 189
Thus, each stage of the inﬂationary ﬁxed point computation is deﬁnable
by an IFP formula.
What is more interesting is that we can write formulae that compare stages
at which various tuples get into the sets Xi
of ﬁxed point computations.
Suppose we are given a formula ϕ(R, x) that gives rise to an inductive operator
Fϕ, where R is k-ary and x has k variables. For example, if we are interested
in inﬂationary ﬁxed point computation, we can always pass from ϕ(R, x) to
R(x) ∨ ϕ(R, x), whose induced operator is inductive.
Given a structure A, we deﬁne |ϕ|A
as the least n such that Xn
= X∞
.
Furthermore, for a tuple a ∈ Ak
, we deﬁne |a|A
ϕ as the least number i such
that a ∈ Xi
in the ﬁxed point computation, and |ϕ|A
+ 1 if no such i exists.
Notice that if ϕ is positive in R, then the stages of the least and inﬂationary
ﬁxed point computation are the same.
We next deﬁne two relations ≺ϕ
and ϕ
on Ak
as follows:
a ≺ϕ
b ≡ |a|A
ϕ < |b|A
ϕ ,
a ϕ
b ≡ |a|A
ϕ ≤ |b|A
ϕ and |a|A
ϕ ≤ |ϕ|A
.
The theorem below shows that these can be deﬁned with least ﬁxed points
of positive formulae.
Theorem 10.11 (Stage comparison). If ϕ is in LFP, then the binary
relations ≺ϕ
and ϕ
are LFP-deﬁnable.
Proof. The idea of the proof is as follows. We want to deﬁne both ≺ϕ
and
ϕ
as a simultaneous ﬁxed point. This has to be done somehow from ϕ, but
in ϕ we may have both positive and negative occurrences of R. So to ﬁnd
some relations to substitute for the negative occurrences of R, we explicitly
introduce the complements of ≺ϕ
and ϕ
:
a ≺ϕ
b ≡ |a|A
ϕ ≥ |b|A
ϕ ,
a ϕ
b ≡ |a|A
ϕ > |b|A
ϕ or |a|A
ϕ = |ϕ|A
+ 1.
We shall be using formulae of the form
ϕ(≺(y)/R, x) and ϕ( (y)/R, x).
This means that, for ϕ(≺ (y)/R, x), every positive occurrence R(z) of R is
replaced by z ≺ϕ
y, and every negative occurrence of R(z) of R is replaced by
z ≺ϕ
y, and likewise for ϕ
. Note that all the occurrences of the four relations
≺ϕ
, ϕ
, ≺ϕ
, ϕ
become positive. Also, we shall write
ϕ(¬≺(y)/R, x),
meaning that every positive occurrence R(z) of R is replaced by ¬(z ≺ϕ
y),
and every negative occurrence of R(z) of R is replaced by ¬(z ≺ϕ
y). These
190 10 Fixed Point Logics and Complexity Classes
will be used in subformulae ¬ϕ(¬ ≺ (y)/R, x), again ensuring that all the
occurrences of ≺ϕ
, ϕ
, ≺ϕ
, ϕ
are positive.
These four relations will be deﬁned by a simultaneous ﬁxed point. For
technical reasons, we shall add one more relation:
a ⊳ϕ
b ≡ |a|A
ϕ + 1 = |b|A
ϕ,
and show how to deﬁne (≺, , ⊳, ≺, ) by a simultaneous ﬁxed point. For
readability only, we may omit the superscript ϕ. We deﬁne the system Ψ of
ﬁve formulae ψi(≺, , ⊳, ≺, , x, y), i = 1, . . . , 5, as follows:
ψ1 ≡ ∃z x z ∧ z ⊳ y ,
ψ2 ≡ ϕ(≺(y)/R, x),
ψ3 ≡ ϕ(≺(x)/R, x) ∧ ¬ϕ(≺(x)/R, y) (10.8)
∧ ϕ( (x)/R, y) ∨ ∀z ¬ϕ(¬ (x)/R, z) ∨ ϕ(≺(x)/R, z) ,
ψ4 ≡ ∃z x z ∧ z ⊳ y ∨ ϕ(∅/R, y) ∨ ∀z¬ϕ(∅/R, z),
ψ5 ≡ ¬ϕ(¬ ≺(y)/R, x)
where ϕ(∅/R, ·) means that all occurrences of R are eliminated and replaced
by false.
Note that all the occurrences of ≺, , ⊳, ≺, in Ψ are positive. We next
claim that the simultaneous least ﬁxed point of Ψ indeed deﬁnes ≺ϕ
, ϕ
,
⊳ϕ
, ≺ϕ
, ϕ
.
To prove the result, we have to show that (≺ϕ
, ϕ
, ⊳ϕ
, ≺ϕ
, ϕ
) satisfy
(10.8), and that for each ∗ ∈ {≺ϕ
, ϕ
, ⊳ϕ
, ≺ϕ
, ϕ
}, if a ∗ b holds, then (a, b)
is in the corresponding ﬁxed point of Ψ (10.8). This will be proved by induction
on |b|A
ϕ .
Below, we prove a few cases for both directions. The remaining cases are
very similar, and are left as an exercise for the reader.
First, we prove that ⊳ϕ
satisﬁes (10.8). Consider a tuple (a, b) in this
relation. The result is immediate if |a|A
ϕ = |ϕ|A
+ 1. If |a|A
ϕ < |ϕ|A
, then
the third conjunct in ψ3(a, b) is equivalent to ϕ( ϕ
(a)/R, b) and, therefore,
ψ3(a, b) holds iﬀ |b|A
ϕ = |a|A
ϕ + 1 iﬀ a ⊳ϕ
b. Finally, if |a|A
ϕ = |ϕ|A
, then the
third conjunct in ψ3 is equivalent to the formula ∀z (¬ϕ(¬ ϕ
(a)/R, z)∨
ϕ(≺ϕ
(a)/R, z)) and, thus, ψ3(a, b) holds iﬀ b is not in the ﬁxed point of ψ3
iﬀ |b|A
ϕ = |ϕ|A
+ 1 = |a|A
ϕ + 1.
Second, we prove by induction on |b|A
ϕ that, for every a, if a⊳ϕ
b or a ≺ϕ
b,
then (a, b) is in the corresponding ﬁxed point of Ψ.
Induction Basis: |b|A
ϕ = 1.
• The case for ⊳ϕ
. This is the simplest case, since |b|A
ϕ = 1 implies that
a ⊳ϕ
b holds for no a.
10.3 Properties of LFP and IFP 191
• The case for ≺ϕ
. Since |b|A
ϕ = 1, we conclude that ϕ(∅/R, b) holds. We
have a ≺ϕ
b for all a, and since ϕ(∅/R, b) is true, (a, b) is in the ﬁxed
point of ψ4 for every a.
Induction Step: Assume that |b|A
ϕ = k + 1 and that the property holds for
all c such that |c|A
ϕ ≤ k.
• The case for ⊳ϕ
. Suppose that a ⊳ϕ
b. Then |a|A
ϕ ≤ k. We show that the
three conjuncts in ψ3 hold for (a, b) and, thus, we conclude that (a, b) is
in the ﬁxed point of ψ3.
Since |a|A
ϕ < |b|A
ϕ, we have |a|A
ϕ ≤ |ϕ|A
and, therefore, ϕ(≺ϕ
(a)/R, a)
holds. By the induction hypothesis, ≺ϕ
(a) =≺(a), so ϕ(≺(a)/R, a) holds.
Since |a|A
ϕ < |b|A
ϕ , ¬ϕ(¬ ≺ϕ
(a)/R, b) holds. By the induction hypothesis,
≺ϕ
(a) =≺(a) and, hence, ¬ϕ(¬ ≺(a)/R, b) holds.
To prove that the third conjunct in ψ3 holds, we consider two cases. If
|b|A
ϕ ≤ |ϕ|A
, then ϕ( ϕ
(a)/R, b) holds. By the hypothesis, ϕ
(a) = (a)
and, therefore, ϕ( (a)/R, b) holds. Otherwise |b|A
ϕ = |ϕ|A
+ 1 and |a|A
ϕ =
|ϕ|A
. In this case all the elements generated at stage |a|A
ϕ + 1 are already
in stage |a|A
ϕ and, therefore, the formula ∀z (¬ϕ(¬ ϕ
(a)/R, z) ∨ ϕ(≺
ϕ
(a)/R, z)) holds. As in the previous cases, by the induction hypothesis
we conclude that ∀z (¬ϕ(¬ (a)/R, z) ∨ ϕ(≺(a)/R, z)) holds.
• The case for ≺ϕ
. Suppose that a ≺ϕ
b, and that the second and third
disjuncts in ψ4 do not hold. Then we show that the ﬁrst disjunct in ψ4
holds and conclude that (a, b) is in the ﬁxed point of ψ4.
Since ϕ(∅/R, b) and ∀z¬ϕ(∅/R, z) do not hold, we have |b|A
ϕ > 1 and
the ﬁxed point of ψ4 contains at least one element. Thus, there exists c
such that c ⊳ϕ
b.
Given that a ≺ϕ
b, we have a ϕ
c and |c|A
ϕ ≤ k. Therefore, we have a
tuple c with |c|A
ϕ ≤ k such that both a ϕ
c and c⊳ϕ
b hold. Now using the
equivalence from the previous case for c ⊳ϕ
b, and applying the induction
hypothesis to a ϕ
c, we conclude that (a, b) satisﬁes ∃z (a z ∧ z ⊳ b),
which ﬁnishes the proof.
Corollary 10.12 (Gurevich-Shelah). IFP = LFP.
Proof. The inclusion LFP ⊆ IFP is immediate. For the converse, proceed by
induction on the formulae. The only case to consider is ifpR,xϕ(R, x). We can
assume, without loss of generality, that ϕ deﬁnes an inductive operator (if
not, consider R(x) ∨ ϕ). Then [ifpR,xϕ(R, x)](t) is equivalent to
ϕ(≺ϕ
(t)/R, t ),
which, by the stage comparison theorem, is an LFP formula.
192 10 Fixed Point Logics and Complexity Classes
As another corollary of stage comparison, we establish a normal form for
LFP formulae. Deﬁne a logic LFP0 which extends FO with the following. If
Φ is a system of FO formulae ϕi(R1, . . . , Rn, x) positive in all the Ri’s, then
[lfpRi,Φ](x) is an LFP0 formula. Note the diﬀerence between this and general
LFP: we only allow ﬁxed points to be applicable to FO formulae, and we do
not close those ﬁxed points under the Boolean connectives and quantiﬁcation.
In other words, every formula of LFP0 is either FO, or of the form [lfpRi,Φ](x),
where Φ consists of FO formulae.
Corollary 10.13. LFP = LFP0.
Proof. We ﬁrst show that LFP0 is closed under ∨, ∧, and ¬. For ∨ and ∧
this is easy: just introduce an extra relation to hold the union or intersection
of two ﬁxed points. For example, given ϕ1(R1, x) and ϕ2(R2, x), we deﬁne
a system Φ that consists of formulae ϕ1(R1, R2, S, x), ϕ2(R1, R2, S, x), and
ϕ3(R1, R2, S, x) ≡ (R1(x) ∨ R2(x)). Then lfpS,Φ is the union of ﬁxed points
of ϕ1 and ϕ2.
The closure under negation follows from the stage comparison:
¬[lfpR,xϕ](t) is equivalent to t ϕ
t.
The closure of LFP0 under ﬁxed point operators is immediate (one simply
adds an extra formula to the system). Thus, LFP0 = LFP.
10.4 LFP, PFP, and Polynomial Time and Space
The goal of this section is to show that the ﬁxed point logics we introduced
capture familiar complexity classes over ordered structures. A structure is
ordered if one of the symbols of its vocabulary σ is <, interpreted as a linear
order on the universe. Recall that we used a linear order for deﬁning an
encoding of a structure: indeed, a string on the tape of a Turing machine is
naturally ordered from left to right. For capturing NP and the polynomial
hierarchy, we did not need the assumption that the structures are ordered,
since we could guess an order by second-order quantiﬁers. However, ﬁxed point
logics are not suﬃciently expressive for guessing a linear order (in fact, this
will be proved formally).
Theorem 10.14 (Immerman-Vardi). Both LFP and IFP capture Ptime
over the class of ordered structures. That is,
LFP+< = IFP+< = Ptime.
Proof. By the Gurevich-Shelah theorem (Corollary 10.12), we can use IFP
and LFP interchangeably. First, we show that LFP formulae can be evaluated
in polynomial time. The proof is by induction on the formulae. The cases of
the Boolean connectives and quantiﬁers are handled in exactly the same way
10.4 LFP, PFP, and Polynomial Time and Space 193
as for FO (see, e.g., Proposition 6.6). For formulae of the form lfpR,xϕ, it
suﬃces to observe the following: if F : ℘(U) → ℘(U) is a Ptime-computable
monotone operator, then lfp(F) can be computed in polynomial time in |U |.
Indeed, we know that the ﬁxed point computation stops after at most | U |
iterations, and each iteration is Ptime-computable. Hence, every LFP formula
can be evaluated in polynomial time.
For the converse, we use the same technique as in the proofs of Trakhtenbrot’s
and Fagin’s theorems. Suppose we are given a property P of σ-structures
which can be tested, on encodings of σ-structures, by a deterministic polynomial
time Turing machine M = (Q, Σ, ∆, δ, q0, Qa, Qr) with a one-way inﬁnite
tape. We assume, without loss of generality, that there is only one accepting
state, qa, that Σ = {0, 1}, and that ∆ extends Σ with the blank symbol.
Let M run in time nk
. As before, we assume that nk
exceeds the size of the
encodings of n-element structures.
With the linear order <, we can again deﬁne the lexicographic linear order
≤k on k-tuples, and use the ordered k-tuples to model both positions of M and
time. We shall deﬁne, by means of ﬁxed point formulae, the 2k-ary predicates
T0, T1, T2, (Hq)q∈Q, where Ti(p, t) indicates that position p at time t contains
i, for i = 0, 1, and blank, for i = 2, and Hq(p, t) indicates that at time t,
the machine is in state q, and its head is in position p. We shall provide a
system Ψ of formulae whose simultaneous inﬂationary ﬁxed point is exactly
(T0, T1, T2, (Hq)q∈Q). Once we have such a system, the sentence testing P will
be given by
∃p ∃t [ifpHqa ,Ψ ](p, t ). (10.9)
Since IFPsimult
= IFP and IFP = LFP, the formula (10.9) can be expressed
in LFP.
The system Ψ contains formulae ψi(p, t, T0, T1, T2, (Hq)q∈Q), i = 0, 1, 2,
deﬁning Ti’s, and ψq(p, t, T0, T1, T2, (Hq)q∈Q), q ∈ Q, deﬁning Hq’s. It has the
property that the jth iteration for each of the relations it deﬁnes, Rj
, contains
{(p, t) | R(p, t) and t < j}, where t < j means that t is among the ﬁrst j − 1
k-tuples in the lexicographic ordering <k. That is, we build the relations Ti’s
and Hq’s in stages, where the jth stage represents the conﬁguration at times
up to j − 1.
The formulae ψi are straightforward to write, and we only sketch a few of
them. The formula ψ0 is of the form
t = 0 ∧ ¬ι(p) ∧ ¬ξ(p) ∨ ¬(t = 0) ∧ α0(t − 1, p, T0, T1, (Hq)q∈Q) .
Here ι and ξ are formula from the proof of Fagin’s theorem (ι holds iﬀ the
pth position of the encoding of the input is 1, and ξ holds iﬀ p is past the last
position of the encoding of the input on the tape). Thus the ﬁrst disjunct says
that at time 0, the tape of M contains the encoding of the input structure.
The formula α0(t − 1, p, T0, T1, (Hq)q∈Q) lists conditions under which at the
following time instant, t, the position p will contain zero. It is similar to the
194 10 Fixed Point Logics and Complexity Classes
formulae we used for modeling M’s transitions in the proof of Fagin’s theorem.
The formula ψ1 is similar to ψ0.
The formula ψq0 is of the form
t = 0 ∧ p = 0 ∨ ¬(t = 0) ∧ αq0 (t − 1, p, T0, T1, (Hq)q∈Q) ,
and other ψq’s are of the form (t > 0) ∧ αq(t − 1, p, T0, T1, (Hq)q∈Q), where
αq again lists conditions under which at the next time instant, M will enter
state q while having the head pointing at p. The ﬁrst disjunct in ψq0 states
that at time 0, M is in state q0 with its head in position 0.
We leave it as a routine exercise to the reader to write the αi’s and αq’s,
based on M’s transitions, and verify that that jth stage of the ﬁxed point computation
for the system Ψ indeed computes the conﬁguration of M for times
not exceeding j − 1. Hence, the ﬁxed point formula (10.9) checks membership
in P, which completes the proof.
Note that using inﬂationary ﬁxed points instead of least ﬁxed points in
the proof of Theorem 10.14 gives us extra freedom in writing down formulae
of the system Ψ: we do not have to ensure that these are positive in Ti’s and
Hq’s. However, one can write those formulae carefully so that they would be
positive in all those relation symbols. In that case, one can replace ifp with
lfp in (10.9). Hence, the proof of Theorem 10.14 then shows that every LFPdeﬁnable
property over ordered structures can be deﬁned by a formula of the
form
∃x [lfpRi,Ψ ](x),
where Ψ is a system of FO formulae positive in relation symbols R1, . . . , Rn.
This, of course, would follow from Corollary 10.13, stating that LFP = LFP0,
but notice that for ordered structures, we obtained the normal form result
without using the stage comparison theorem.
We have seen that for several logics, adding an order increases their expressiveness;
that is, L (L+ <)inv for L being FO, or one of its counting
extensions, or MSO. The same is true for LFP, IFP, and PFP; the proof of
this will be given in the next chapter when we describe additional tools such
as ﬁnite variable logics and pebble games. At this point we only say that
the query that separates these logics on ordered and unordered structures is
even: it is not expressible in any of the ﬁxed point logics without a linear
order, but is obviously already in LFP+<, since it is Ptime-computable.
We conclude this section by considering the partial ﬁxed point logic, PFP.
Over ordered structures, it corresponds to another well-known complexity
class.
Theorem 10.15. Over ordered structures, PFP captures Pspace.
10.5 Datalog and LFP 195
The proof, of course, follows the proofs of Trakhtenbrot’s, Fagin’s, and
Immerman-Vardi’s theorems. We only explain why PFP formulae can be evaluated
in Pspace. Consider pfpR,xϕ(R, x), where R is k-ary, and let Xi
’s be
the stages of the partial ﬁxed point computation on A with |A|= n. There are
two possibilities. Either Xm+1
= Xm
for some m, in which case a ﬁxed point
is reached. Otherwise, for some 0 ≤ i, j ≤ 2nk
, i + 1 < j, we have Xi
= Xj
,
and in this case the formula [pfpR,xϕ(R, x)](t) would evaluate to false, since
the partial ﬁxed point is the empty set. Hence, one has to check which of
these cases is true. For that, it suﬃces to enumerate all the subsets of Ak
,
one by one (which can be done in Pspace), and proceed with computing the
sequence Xi
, checking whether a ﬁxed point is reached. Since only 2nk
steps
need to be made, the entire computation is in Pspace.
To show that Pspace ⊆ PFP+<, one modiﬁes the proof of the
Immerman-Vardi theorem, to simulate the accepting condition of a Turing
machine by means of a partial ﬁxed point formula. We leave the details to the
reader (Exercise 10.9).
10.5 Datalog and LFP
In this section we review a database query language Datalog, and relate it
to ﬁxed point logics.
Recall that FO is used as the basic relational query language (it is known
under the name relational calculus in the database literature). Conjunctive
queries, seen in Sect. 6.7, constitute an important subclass of FO queries.
They can be deﬁned in the fragment of FO that only includes conjunction ∧
and existential quantiﬁcation ∃. There is another convenient form for writing
conjunctive queries that in fact is used most often in the literature. Instead of
ψ(x) ≡ ∃y i αi(x, y), one omits the existential quantiﬁers and replaces the
∧’s with commas:
Rψ(x) :– α1(x, y), α2(x, y), . . . , αm(x, y). (10.10)
Here Rψ is a new relation symbol; the meaning of (10.10) is that, for a given
structure A, this new relation contains the set of all tuples a such that A |=
ψ(a).
Expressions of the form (10.10) are called rules; the part of the rule that
appears on the left of the :– (in this case, Rψ(x)) is called its head, and the
part of the rule on the right of the :– is called its body. A rule is converted into
a conjunctive query by replacing commas with conjunctions, and existentially
quantifying all the variables that appear in the body but not in the head.
For example, the rule
q(x, y) :– E(x, z), E(z, v), E(v, y)
is translated into ∃z∃v E(x, z) ∧ E(z, v) ∧ E(v, y) .
196 10 Fixed Point Logics and Complexity Classes
Datalog programs contain several rules some of which may be recursive:
that is, the same predicate symbol may appear in both the head and the body
of a rule. A typical Datalog program would be of the following form:
trcl(x, y) :– E(x, y)
trcl(x, y) :– E(x, z), trcl(z, y)
(10.11)
This program computes the transitive closure of E: it says that (x, y) is in
the transitive closure if there is an edge (x, y), or there is an edge (x, z) such
that (z, y) is in the transitive closure. As with the ﬁxed point deﬁnition of the
transitive closure, to evaluate this program we iterate this deﬁnition, starting
with the empty set, until a ﬁxed point is reached.
Deﬁnition 10.16. A Datalog program over vocabulary σ is a pair (Π, Q),
where Π is a set of rules of the form
P(x) :– α1(x, y), . . . , αm(x, y). (10.12)
Here the relation symbol P in the head of rule (10.12) does not occur in σ,
and each αi is an atomic formula of the form R(x, y), for R ∈ σ, or P′
(x, y),
for P′
that occurs as a head of one of the rules of Π. Furthermore, Q is the
head of one of the rules of Π.
By Datalog¬ we mean the extension of Datalog where negated atomic
formulae of the form ¬R(·), for R ∈ σ, can appear in the bodies of rules
(10.12).
For example, the transitive closure program consists of the rules (10.11),
and trcl is the output predicate Q.
In the standard Datalog terminology, relation symbols from σ are called
extensional predicates, and symbols not in σ that appear as heads of rules
are called intensional predicates. These are the predicates computed by the
program, and Q is its output.
To deﬁne the semantics of a Datalog (or Datalog¬) program (Π, Q),
we introduce the immediate consequence operator FΠ. Let P1, . . . , Pk list all
the intensional predicates (with Q being one of them). Let ni be the arity of
Pi, i = 1, . . . , k. Let
Pi(x) :– γ1
1 (x, y1), . . . , γ1
m1
(x, y1)
· · · · · · · · · · · ·
Pi(x) :– γl
1(x, yl), . . . , γl
ml
(x, yl)
(10.13)
enumerate all the rules in Π with Pi as the head.
Given a structure A and a tuple of sets Y = (Y1, . . . , Yk), Yi ⊆ Ani
,
i = 1, . . . , k, we deﬁne FΠ(Y ) = (Z1, . . . , Zk), where
Zi = a ∈ Ani
(A, Y1, . . . , Yk) |=
l
j=1
∃yj γj
1(a, yj) ∧ . . . ∧ γj
mj
(a, yj) ,
10.5 Datalog and LFP 197
where formulae γj
l are the formulae from the rules (10.13) for the intensional
predicate Pi. In other words, a ∈ Zi can be derived by applying one of the
rules of Π whose head is Pi, using Y as the interpretation for the intensional
predicates.
Since the formula above is positive in all the intensional predicates (even
for a Datalog¬ program), the operator FΠ is monotone. Hence, starting with
(∅, . . . , ∅) and iterating this operator, we reach the least ﬁxed point lfp(FΠ ) =
(P∞
1 , . . . , P∞
k ). The output of (Π, Q) on A is deﬁned as Q∞
(recall that Q is
one of the Pi’s).
Returning to the transitive closure example, the stages of the ﬁxed point
computation of the immediate consequence operator are exactly the same as
the stages of computing the least ﬁxed point of E(x, y)∨∃z (E(x, z)∧R(z, y)),
and hence, on an arbitrary ﬁnite graph, the program (10.11) computes its
transitive closure.
Analyzing the semantics of a Datalog program (Π, Q), we can see that
it is simply a simultaneous least ﬁxed point of a system Ψ of formulae
ψi(x, P1, . . . , Pk) ≡
j
∃yj γj
1(a, yj) ∧ . . . ∧ γj
mj
(a, yj) . (10.14)
That is, the answer to (Π, Q) on A is
{a | A |= [lfpQ,Ψ ](a) }.
Hence, each Datalog or Datalog¬ program can be expressed in LFPsimult
,
and thus in LFP.
What fragment of LFP does Datalog¬ correspond to? The special form
of formulae ψi (10.14) indicates that there are some syntactic restrictions
on LFP formulae into which Datalog¬ is translated. We can capture these
syntactic restrictions by a notion of existential least ﬁxed point logic.
Deﬁnition 10.17. The existential least ﬁxed point logic, ∃LFP, over vocabulary
σ, is deﬁned as a restriction of LFP over σ, where:
• negation can only be applied to atomic formulae of vocabulary σ (i.e.,
formulae R(·), where R ∈ σ), and
• universal quantiﬁcation is not allowed.
Theorem 10.18. ∃LFP = Datalog¬.
Proof. We have seen one direction already, since every Datalog¬ query can
be translated into one simultaneous ﬁxed point of a system of FO formulae ψi
(10.14), in which no universal quantiﬁers are used, and negation only applies
to atomic σ-formulae. Elimination of the simultaneous ﬁxed point introduces
no negation and no universal quantiﬁcation, and hence Datalog¬ ⊆ ∃LFP.
198 10 Fixed Point Logics and Complexity Classes
For the converse, we translate each ∃LFP formula ϕ(x1, . . . , xk) into an
equivalent Datalog¬ program (Πϕ, Qϕ), which, on any structure A, computes
Q∞
ϕ = ϕ(A). Moreover, the translation ensures that no relation symbol
that appears positively in ϕ is negated in Πϕ. The translation proceeds by
induction on the structure of the formulae as follows:
• If ϕ(x) is an atomic or negated atomic formula (i.e., R(x) or ¬R(x)), then
Πϕ contains one rule Qϕ(x) :– ϕ(x).
• If ϕ ≡ α ∧ β, then
Πϕ = Πα ∪ Πβ ∪ {Qϕ(x) :– Qα(x), Qβ(x)}.
• If ϕ ≡ α ∨ β, then
Πϕ = Πα ∪ Πβ ∪ {Qϕ(x) :– Qα(x), Qϕ(x) :– Qβ(x)}.
• If ϕ(x) ≡ ∃yα(y, x), then
Πϕ = Πα ∪ {Qϕ(x) :– Qα(y, x)}.
• Let ϕ(x) ≡ [lfpR,yα(R, y)](x). By the induction hypothesis, we have
a program (Πα, Qα) for α; notice that R appears positively in α, and
thus does not appear negated in Πα. Hence, we can deﬁne the following
program, in which R is an intensional predicate:
Πϕ = Πα ∪ {R(y) :– Qα(y), Qϕ(x) :– R(x)},
and which computes the least ﬁxed point of α.
Thus, Datalog and Datalog¬ correspond to syntactic restrictions of
LFP. But could they still be suﬃcient for capturing Ptime?
Let us ﬁrst look at a Datalog program (Π, Q), and suppose we have two
σ-structures, A1 and A2, on the same universe A, such that for every symbol
R ∈ σ, we have RA1
⊆ RA2
. Then a straightforward induction on the stages
of the immediate consequence operator shows that (Π, Q)[A1] ⊆ (Π, Q)[A2],
where by (Π, Q)[A] we denote the result of (Π, Q) on A. Hence, Datalog
only expresses monotone properties, and thus cannot capture Ptime (exercise:
exhibit a non-monotone Ptime property).
Queries expressible in Datalog¬ satisfy a slightly diﬀerent monotonicity
property. Suppose A is a substructure of B; that is, A ⊆ B, and for each R ∈ σ,
RA
is the restriction of RB
to A. Then (Π, Q)[A] ⊆ (Π, Q)[B], where (Π, Q)
is a Datalog¬ program. Indeed, when you look at the formulae (10.14), it is
clear that if a witness a is found in A, it will be a witness for the existential
quantiﬁers in B. Since it is again not hard to ﬁnd a Ptime property that
fails this notion of monotonicity, Datalog¬ fails to capture Ptime. Furthermore,
even adding order preserves monotonicity, and hence Datalog¬ fails
to capture Ptime even over ordered structures.
10.6 Transitive Closure Logic 199
But now assume that on all the structures, we have a successor
relation succ available, as well as constants min, max for the minimal
and maximal element with respect to the successor relation. It
is impossible for A, succA
, minA
, maxA
, . . . to be a substructure of
B, succB
, minB
, maxB
, . . . , and hence the previous monotonicity argument
does not work. In fact, the following theorem can be shown.
Theorem 10.19. Over structures with successor relation and constants for
the minimal and maximal elements, Datalog¬ captures Ptime.
The proof mimics the proofs of Fagin’s and Immerman-Vardi’s theorems,
by directly coding deterministic polynomial time Turing machines in
Datalog¬, and is left to the reader as an exercise.
10.6 Transitive Closure Logic
One of the standard examples of queries expressible in LFP is the transitive
closure. In this section, we study a logic based on the transitive closure
operator, rather than the least or inﬂationary ﬁxed point, and prove that it
corresponds to a well-known complexity class.
Deﬁnition 10.20. The transitive closure logic TrCl is deﬁned as an extension
of FO with the following formation rule: if ϕ(x, y, z) is a formula, where
|x|=|y|= k, and t1, t2 are tuples of terms of length k, then
[trclx,yϕ(x, y, z)](t1, t2)
is a formula whose free variables are z plus the free variables of t1, t2.
The semantics is deﬁned as follows. Given a structure A, values a for z
and ai for ti, i = 1, 2, construct the graph G on Ak
with the set of edges
{(b1, b2) | A |= ϕ(b1, b2, a)}.
Then
A |= [trclx,yϕ(x, y, a)](a1, a2)
iﬀ (a1, a2) is in the transitive closure of G.
For example, connectivity of directed graphs can be expressed by the TrCl
formula ∀u∀v [trclx,y(E(x, y) ∨ E(y, x))](u, v).
We now state the main result of this section.
Theorem 10.21. Over ordered structures, TrCl captures NLog.
200 10 Fixed Point Logics and Complexity Classes
Having seen a number of results of this type, one might be tempted to
think that the proof is by a simple modiﬁcation of the proofs of Trakhtenbrot’s,
Fagin’s, and Immerman-Vardi’s theorems. However, in this case we are
running into problems, and the problems arise in the “easy” part of the proof:
TrCl ⊆ NLog.
It is well known that the transitive closure of a graph can be computed
by a nondeterministic logspace machine. Hence, trying to show the inclusion
TrCl ⊆ NLog by induction on the structure of the formulae, we have no
problems with the transitive closure operator. The problematic operation is
negation. Since NLog is a nondeterministic class, acceptance means that some
computation ends in an accepting state. The negation of this statement is that
all computations end in rejecting states, and it is not clear whether this can be
reformulated as an existential statement. Our strategy for proving Theorem
10.21 is to split it into two statements. First, we deﬁne a logic posTrCl in
which all occurrences of the transitive closure operator are positive (i.e., occur
under the scope of an even number of negations). In fact, one can always
convert such a formula into an equivalent formula in which no trcl operator
would be contained in the scope of any negation symbol. We then prove two
results.
Proposition 10.22. Over ordered structures, posTrCl captures NLog.
Proposition 10.23. Over ordered structures, posTrCl = TrCl.
Clearly, Theorem 10.21 will follow from these. Furthermore, they yield the
following corollary.
Corollary 10.24 (Immerman–Szelepcs´enyi). NLog is closed under
complementation.
This is in sharp contrast to other nondeterministic classes such as NP or
the levels Σp
i of the polynomial hierarchy, where closure under complementation
remains a major unsolved problem. In particular, for NP this is the
problem of whether NP = coNP.
We start by showing how to prove Proposition 10.22. With negation gone,
this proof becomes very similar to the other capture proofs seen in this and
the previous chapters. Indeed, the inclusion posTrCl ⊆ NLog is proved
by straightforward induction (since negation is only applied to FO formulae).
For the converse, suppose we have a nondeterministic logspace machine M. In
such a machine, we have one read-only tape that stores the input, enc(A), and
one work tape, whose size is bounded by c log n for some constant c (where
n =|A|). Let Q be the set of states. To model a conﬁguration of M, we need
to model both tapes. The input tape can be described by a tuple of variables
p, where p indicates a position on the tape, just as in the proof of Fagin’s and
Immerman-Vardi’s theorems.
10.6 Transitive Closure Logic 201
For the work tape, we need to describe its content, and the position of
the head, together with the state. The latter (position and the state) can be
described with | Q | variables (assuming c log n is shorter than the encoding
of structures with an n-element universe). If the alphabet of the work tape is
{0, 1}, there are 2c log n
= nc
possible conﬁgurations, which can be described
with c variables. Hence, the entire conﬁguration can be described by tuples s
of length at most c(σ)+|Q|+c, where c(σ) is a constant depending on σ that
gives an upper bound on the size of tuples p describing positions in the input.
Then the class of structures accepted by M is deﬁnable by the formula
∃s0∃s1 ϕinit(s0) ∧ ϕﬁnal(s1) ∧ [trclx,yϕnext(x, y)](s0, s1) . (10.15)
Here ϕinit(s0) says that s0 is the initial conﬁguration, with the input tape
head pointing at the ﬁrst position in the initial state, and the work tape
containing all zeros; ϕﬁnal(s1) says that s1 is an accepting conﬁguration (it is
in an accepting state), and ϕnext(x, y) says that the conﬁguration y is obtained
from the conﬁguration x in one move. It is a straightforward (but somewhat
tedious) task to write these three formulae in FO, and it is done similarly to
the proofs of other capture theorems. This proves Proposition 10.22.
Before we prove Proposition 10.23, we re-examine (10.15). Let min and
max, as before, stand for the constants for the minimal and the maximal
element with respect to the ordering, and let min and max stand for the
tuples of these constants, of the same length as the conﬁguration description.
Suppose instead of ϕnext(x, y) we use the formula ϕ′
next:
ϕnext(x, y) ∨ (x = min ∧ ϕinit(y)) ∨ (ϕﬁnal(x) ∧ y = max),
allowing jumps from min to the initial conﬁguration, and from any ﬁnal
conﬁguration to max. Then (10.15) is equivalent to
[trclx,yϕ′
next(x, y)](min, max). (10.16)
Thus, every posTrCl formula over ordered structures deﬁnes an NLog property,
which can be expressed by (10.15), and hence by (10.16). We therefore
obtained the following.
Corollary 10.25. Over ordered structures, every posTrCl formula is equivalent
to a formula of the form [trclx,yϕ](min, max), where ϕ is FO.
We now prove Proposition 10.23. The proof is by induction on the structure
of TrCl formulae, and the only nontrivial case is that of negation. By
Corollary 10.25, we may assume that negation is applied to a formula of the
form (10.16); that is, we have to show that
¬ [trclx,yϕ(x, y)](min, max), (10.17)
202 10 Fixed Point Logics and Complexity Classes
where ϕ is FO, is equivalent to a posTrCl formula.
Assume x = k. For an arbitrary formula α(x, y) with | y |= k, and a
structure A, let dA
α (a, b) be the shortest distance between a and b in α(A)
(viewed as a graph on Ak
). If no path between a and b exists, we assume
dA
α (a, b) = ∞. We deﬁne
ReachA
α (a) = {b ∈ Ak
| dA
α (a, b) = ∞}.
Thus, (10.17) holds in A iﬀ
|ReachA
ϕ (min)| = |ReachA
ϕ(x,y)∧¬(y=max)(min)| . (10.18)
Notice that the maximal ﬁnite value of dA
α (a, b) is |A|
k
. Since structures are
ordered, we can count up to |A|
k
using (k +1)-tuples of variables: associating
the universe A with {0, . . . , n−1}, we let a (k+1)-tuple (c1, . . . , ck+1) represent
c1 · nk
+ c2 · nk−1
+ . . . + ck · n + ck+1. (10.19)
As it will not cause any confusion, we shall use the notation c for both the
tuple and the number (10.19) it represents. Note also that constants 0 = min
and 1, as well as successor and predecessor c + 1 and c − 1, are FO-deﬁnable
in the presence of order, so we shall use them in formulae. Also notice that
the maximum value of dA
α (a, b), |A|k
, is represented by 10 = (1, 0, . . . , 0).
One useful property of posTrCl is that over ordered structures it can
count: for a formula β(x) of posTrCl, one can construct another posTrCl
formula countβ(y) such that A |= countβ(c) if there are at least c tuples a
in β(A). Indeed, we can enumerate all the tuples a, and go over all of them,
checking if β(a) holds. Since β can be checked in NLog, the whole algorithm
has NLog complexity, and thus is deﬁnable in posTrCl. One can also express
this counting directly: if ψ(x1v1, x2v2) is (x2 = x1 + 1) ∧ (v2 = v1) ∨ (x2 =
x1 + 1) ∧ β(x2) ∧ (v2 = v1 + 1) , then
∃z
trclx1v1,x2v2
ψ(x1v1, x2v2) min, min, max, z
∧ y = z ∨ β(min) ∧ (y = z + 1)
expresses countβ(y) (exercise: explain why).
Our next goal is to prove the following lemma.
Lemma 10.26. For every FO formula α(x, y), there exists a posTrCl formula
ρα(x, z) such that for every A,
A |= ρα(a, c) iﬀ |ReachA
α (a)|= c.
Before proving this, notice that Lemma 10.26 immediately implies Proposition
10.23, since by (10.18), (10.17) is equivalent to
10.6 Transitive Closure Logic 203
∃z ρϕ(min, z) ∧ ρϕ(x,y)∧¬(y=max)(min, z) ,
which is a posTrCl formula.
Let rA
α (a, c) denote the cardinality of {b | dA
α (a, b) ≤ c}, so that the
cardinality of the set ReachA
α (a) is rA
α (a, 10).
Assume that there is a formula γα(x, v, z1, z2) such that A |= γα(a, e, c1, c2)
means that if rA
α (a, e) = c1, then rA
α (a, e + 1) = c2. With such a formula γα,
ρα(x, z) is deﬁnable by
[trclv1z1,v2z2
(v2 = v1 + 1) ∧ γα(x, v1, z1, z2) ] min, min, 10, z),
since the above formula says that rα(x, 10) = z. Thus, it remains to show how
to deﬁne γα.
In preparation for writing down the formula γα, notice that there is a
posTrCl formula dα(x, y, z) such that A |= dα(a, b, c) iﬀ dA
α (a, b) ≤ c. Indeed,
it is given by
[trclx1z1,x2z2
α(x1, x2) ∧ (z1 < z2) ] x, min, y, z).
Coming back to γα, notice that rA
α (a, e + 1) = c2 iﬀ
c2 + |{b | dA
α (a, b) > e + 1}| = 10 (= nk
).
Hence, if we could write a posTrCl formula expressing this condition, we
would be able to express γα in posTrCl.
Suppose we can express dA
α (a, b) > e+1 in posTrCl. Then γα is straightforward
to write, since we already saw how to count: we start with c2 and
increment the count every time b with dA
α (a, b) > e + 1 is found; then trcl
is applied to see if 10 is reached (we leave the details of this formula to the
reader).
Thus, our last task is to express the condition dA
α (a, b) > e+1 in posTrCl.
Even though we have a formula dα(x, y, z) in posTrCl (meaning dα(x, y) ≤
z), what we need now is the negation of such a formula, which is not in
posTrCl. However, it is possible to express dA
α (a, b) > e + 1 in posTrCl
under the condition rA
α (a, e) = c1 (which is all that we need anyway, by the
deﬁnition of γα).
If e = min, then dA
α (a, b) > 1 is equivalent to ¬α(a, b). Otherwise,
dA
α (a, b) > e + 1 iﬀ one can ﬁnd c tuples f diﬀerent from b such that
dA
α (a, f) ≤ e and ¬α(f, b) for all such f. Now the distance formula (which
itself is a posTrCl formula) occurs positively, and to express dA
α (a, b) > e+1,
we simply count the number of f satisfying the conditions above, and compare
that number with c. As we have seen earlier, such counting of f’s can
be done by a posTrCl formula. Thus, γα is expressible in posTrCl, which
completes the proof of Lemma 10.26 and Theorem 10.21.
204 10 Fixed Point Logics and Complexity Classes
10.7 A Logic for Ptime?
We have seen that LFP and IFP capture Ptime on the class of ordered structures.
On the other hand, for classes such as NP and coNP we have logics
that capture them over all structures. The question that immediately arises is
whether there is a logic that captures Ptime, without the additional restriction
to ordered structures. If there were such a logic, answering the “Ptime
vs. NP” question would become a purely logical problem: one would have to
separate two logics over the class of all ﬁnite structures.
However, all attempts to produce a logic that captures Ptime have failed
so far. In fact, it is even conjectured that no such logic exists:
Conjecture (Gurevich) There is no logic that captures Ptime over the
class of all ﬁnite structures.
This is a very strong conjecture: since there is a logic for NP, by Fagin’s
theorem, it would imply that Ptime = NP! The conjecture described precisely
what a logic is. We shall not go into technical details, but the main idea is
to rule out the possibility of taking an arbitrary collection of properties and
stating that they constitute a logic. For example, is the collection of all Ptime
properties a logic? If we want the conjecture to hold, clearly the answer ought
to be no.
In this short section, we shall present a few attempts to refute Gurevich’s
conjecture and ﬁnd a logic for Ptime – and show how they all failed. The
results here will be presented without proofs, and the interested reader should
consult the bibliographic notes section for the references.
What are examples of properties not expressible in LFP or IFP over unordered
structures? Although we have not proved this yet, we mentioned one
example: the query even. We shall see later, in Chap. 11, that in general
IFP cannot express nontrivial counting properties over unordered structures.
Hence, one might try to add counting to IFP (it is better to use IFP, so
that positiveness would not constrain us), and hope that such an extension
captures Ptime.
This extension of IFP, denoted by IFP(Cnt), can be deﬁned in the same
way as we deﬁned FO(Cnt) from FO: one introduces the additional universe
{0, . . . , n − 1}, where n is the cardinality of the universe of a σ-structure A,
and extends the logic with counting quantiﬁers ∃ix. However, this extension
still falls short of Ptime, and the separating example is very complicated.
Theorem 10.27. There are Ptime properties which are not deﬁnable in
IFP(Cnt).
Another attempt to expand IFP is to introduce generalized quantiﬁers,
already seen in Chap. 8. There, we only dealt with unary generalized quantiﬁers;
here we present a general deﬁnition, but for notational simplicity deal
with the case of one additional relation per quantiﬁer.
10.7 A Logic for Ptime? 205
Let R be a relation symbol of arity k, R ∈ σ. Let C ⊆ STRUCT[{R}] be a
class of structures closed under isomorphism. This gives rise to a generalized
quantiﬁer QC and the extension of IFP with QC, denoted by IFP(QC), which
is deﬁned as follows. If ϕ(x, y) is an IFP(QC) formula of vocabulary σ, and
|x|= k, then
ψ(y) ≡ QCx ϕ(x, y) (10.20)
is an IFP(QC) formula. The other formation rules are exactly the same as for
IFP. The semantics of (10.20) is as follows:
A |= ψ(b) ⇔ A, {a | A |= ϕ(a, b)} ∈ C.
For example, if C is the class of connected graphs, then the sentence
QCx, y E(x, y) simply tests if the input graph is connected.
If Q is a set of generalized quantiﬁers, then by IFP(Q) we mean the
extension of IFP with the formulae (10.20) for all the generalized quantiﬁers
in Q.
There is a “simple” way of getting a logic that captures Ptime: it is
IFP(Qp), where Qp is the collection of all Ptime properties. However, this is
cheating: we deﬁne the logic in terms of itself. But perhaps there is a nicely
behaving set Q of generalized quantiﬁers such that IFP(Q) captures Ptime.
The ﬁrst result, showing that such a class – if it exists – will be hard to
ﬁnd, says the following.
Proposition 10.28. Let Qn be a collection of generalized quantiﬁers of arity
at most n. There there exists a vocabulary σn such that over σn-structures,
IFP(Qn) fails to capture Ptime.
The reason this result is not completely satisfactory is that the arity of
relations in σn depends on n. For example, Proposition 10.28 says nothing
about the impossibility of capturing Ptime over graphs. And in fact there is
a collection Qgr of generalized binary quantiﬁers (i.e., of arity 2) such that
IFP(Qgr) expresses all the Ptime properties of graphs (why?). In fact, one
can even show that there is a single ternary generalized quantiﬁer Q3 such
that IFP(Q3) expresses all the Ptime properties of graphs (intuitively, it is
possible to code Qgr with one ternary generalized quantiﬁer), but Q3 itself is
not Ptime-computable, and hence IFP(Q3) fails to capture Ptime on graphs.
The existence of a generalized quantiﬁer Q3 raises an intriguing possibility
that for some ﬁnite collection Qﬁn of Ptime-computable generalized quantiﬁers,
IFP(Qﬁn) captures Ptime on unordered graphs. However, this attempt
to refute Gurevich’s conjecture does not work either.
Theorem 10.29. There is no ﬁnite collection Qﬁn of Ptime-computable generalized
quantiﬁers such that IFP(Qﬁn) captures Ptime on unordered graphs.
Thus, given all that we know today, Gurevich’s conjecture may well be
true, as it has withstood a number of attempts to produce a logic for Ptime
over unordered structures.
206 10 Fixed Point Logics and Complexity Classes
10.8 Bibliographic Notes
Inductive operators and ﬁxed point logics are studied extensively in
Moschovakis [185] in the context of arbitrary models. The systematic study
of ﬁxed point logics in ﬁnite model theory originated with Chandra and Harel
[33], who introduced the least ﬁxed point operator in the context of database
query languages to overcome well-known limitations of FO. The subject is
treated in detail in Ebbinghaus and Flum [60], Immerman [133], Grohe [106];
see also a recent survey by Dawar and Gurevich [51]. All of these references
present the Tarski-Knaster theorem, least and inﬂationary ﬁxed point logics,
and simultaneous ﬁxed points.
The “even simple path” example is taken from Kolaitis [148], where it is
attributed to Yannakakis. See also Exercise 10.2.
The stage comparison theorem was proved in Moschovakis [185], and specialized
for the ﬁnite case in Immerman [130] and Gurevich and Shelah [119];
the proof presented here follows Leivant [165]. Corollary 10.12 is from Gurevich
and Shelah [119], and Corollary 10.13 from [130].
The connection between ﬁxed point logics and polynomial time was discovered
by several people in the early 1980s. Sazonov [212] showed in 1980 that a
certain least ﬁxed point construction – of recursive-theoretic ﬂavor – captures
Ptime. Then, in 1982, Immerman [129], Vardi [244], and Livchak [172] proved
what is now known as the Immerman-Vardi theorem. Both Immerman’s and
Vardi’s papers appeared in the proceedings of the STOC 1982 conference;
Livchak’s paper was published in Russian and became known much later;
hence Theorem 10.14 is usually referred to as the Immerman-Vardi theorem.
In 1986, Immerman published a full version of his 1982 paper (see [130]).
Theorem 10.15 is from Vardi [244].
Datalog has been studied extensively in the database literature, see, e.g.,
Abiteboul, Hull, and Vianu [3] for many additional results and references.
Theorem 10.19 is from Papadimitriou [194].
Theorem 10.21 is from Immerman [130, 132]: the ﬁrst of these papers
showed that posTrCl captures NLog, and the other paper proved closure
under complementation (see also Szelepcs´enyi [226]).
A number of references discuss Gurevich’s conjecture in detail (e.g., Otto
[191], Kolaitis [147], as well as [60]); they also discuss the notion of a “logic”
suitable for capturing Ptime. Theorem 10.27 is from Cai, F¨urer, and Immerman
[30] (see also Otto [191], as well as Gire and Hoang [91] for extensions).
Theorem 10.29 is from Dawar and Hella [52].
Sources for exercises:
Exercise 10.10: Ajtai and Gurevich [13]
Exercise 10.11: Immerman [130]
Exercises 10.12 and 10.13: Gr¨adel [97]
Exercise 10.14: Immerman [131]
Exercise 10.15: Gr¨adel and McColm [101]
10.9 Exercises 207
Exercise 10.16: Abiteboul and Vianu [5]
Exercises 10.17 and 10.18: Afrati, Cosmadakis, and Yannakakis [8]
Exercise 10.19: Gr¨adel and Otto [102]
Exercises 10.20 and 10.21: Grohe [107]
Exercise 10.22: Shmueli [220] and Cosmadakis et al. [43]
Exercise 10.23: Marcinkowski [179]
Exercise 10.24: Gottlob and Koch [94]
Exercise 10.25: Gurevich, Immerman, and Shelah [118]
Exercise 10.26: Dawar and Hella [52]
Exercise 10.27: Dawar, Lindell, and Weinstein [54]
10.9 Exercises
Exercise 10.1. Prove Proposition 10.3.
Exercise 10.2. Prove that the simultaneous ﬁxed point shown before Theorem 10.8
deﬁnes pairs of nodes connected by a simple path of even length.
Hint: use Menger’s theorem in graph theory.
Also show that this does not generalize to directed graphs.
Exercise 10.3. Prove Theorem 10.8 for a system involving an arbitrary number of
formulae.
Exercise 10.4. Prove Theorem 10.10.
Exercise 10.5. Prove Theorem 10.15.
Exercise 10.6. Prove Theorem 10.19.
Exercise 10.7. Prove that the combined complexity of LFP is Exptime-complete.
Exercise 10.8. Consider an alternative semantics for Datalog programs. Given a
set of rules Π and a structure A, an instantiation P of all the intensional predicates
is called a model of Π on A if every rule of Π is satisﬁed. Show that for any Π,
there exists a minimal, with respect to inclusion, model Pmin. The minimal model
semantics of Datalog deﬁnes the answer to (Π, Q) on A as the interpretation of Q
in Pmin.
Prove that the ﬁxed point and the minimal model semantics of Datalog coin-
cide.
Exercise 10.9. Write down the formulae ψi and ψq from the proof of the
Immerman-Vardi theorem, and show that their simultaneous least ﬁxed point computes
the relations Ti and Hq.
Exercise 10.10. Show that over ﬁnite structures, monotone and positive are two
diﬀerent concepts (they are known to be the same over inﬁnite structures, see Lyndon
[175]). That is, give an example of an FO formula ϕ(P, ·) which is monotone in P,
but not equivalent to any FO formula positive in P.
208 10 Fixed Point Logics and Complexity Classes
Exercise 10.11. Assume that the vocabulary σ contains at least two distinct constants.
Prove a stronger normal form result for LFP: every LFP formula is equivalent
to a formula of the form [lfpR,xϕ(R, x)](t), where ϕ is an FO formula.
Hint: use two constants to eliminate nested ﬁxed points.
Exercise 10.12. Consider a restriction of SO that consists of formulae of the form
QR1 . . . QRn∀x
^
l
αl,
where each Q is either ∃ or ∀, and each αl is Horn with respect to R1, . . . , Rn. That
is, it is of the form
γ1 ∧ . . . ∧ γm → β,
where each γj either does not mention Ri’s, or is of the form Ri(u), and β is either
of the form Ri(u), or false. We denote such restriction by SO-Horn. If all the
quantiﬁers Q are existential, we speak of ∃SO-Horn.
Prove that over ordered structures, SO-Horn and ∃SO-Horn capture Ptime.
Exercise 10.13. The class SO-Krom is deﬁned similarly to SO-Horn, except that
each αl is a disjunction of at most two atoms of the form Ri(u) or ¬Rj (u), and
a formula that does not mention the Ri’s. ∃SO-Krom is deﬁned as the restriction
where all second-order quantiﬁers are existential.
Prove that both SO-Krom and ∃SO-Krom capture NLog over ordered struc-
tures.
Exercise 10.14. Deﬁne a variant of the transitive closure logic, denoted by
DetTrCl, where the transitive closure operator trcl is replaced by the deterministic
transitive closure. When applied to a graph V, E , it ﬁnds pairs (a, b) which are
connected by a deterministic path: on such a path, every node except b must be of
out-degree 1.
Prove that DetTrCl captures DLog over ordered structures.
Exercise 10.15. Prove that over unordered structures, DetTrCl TrCl LFP.
Exercise 10.16. Consider the following language that computes queries over
STRUCT[σ]. Given an input structure A, its programs compute sequences of relations,
and are deﬁned inductively as follows:
• ∅ is a program that computes no relation.
• If Π(R1, . . . , Rn) is a program that computes relations R1, . . . , Rn, where
R1, . . . , Rn ∈ σ, then
Π(R1, . . . , Rn); R(x) :– ϕ(x);
where R ∈ σ ∪ {R1, . . . , Rn}, and ϕ is an FO formula in the vocabulary of σ
expanded with R1, . . . , Rn, is a program that computes relations R1, . . . , Rn, R,
with R obtained by evaluating ϕ on the expansion of A with R1, . . . , Rn, R.
• If Π(R1, . . . , Rn) is a program that computes relations R1, . . . , Rn, and
Π′
(T1, . . . , Tk) is a program over STRUCT[σ ∪ {R1, . . . , Rn} ∪ {S1, . . . , Sk}],
where the arity of each Si matches the arity of Ti, then
Π(R1, . . . , Rn); while change do Π′
(T1, . . . , Tk) end;
10.9 Exercises 209
is a program that computes (R1, . . . , Rn, T1, . . . , Tk) over σ-structures. The
meaning of the last statement is that starting with (∅, . . . , ∅) as the interpretation
of the Si’s, one iterates Π′
; it computes the Ti’s, which are then reused as
Si’s, and so on. This is done as long as it changes one relation among the Si’s.
If this program terminates, the values of the relations (T1, . . . , Tk) in that state
become the output.
For example, the while loop
while change do T(x, y) :– E(x, y) ∨ ∃z (E(x, z) ∧ S(z, y)) end;
computes the transitive closure of E.
Prove that over ordered structures, such while programs compute precisely the
Pspace queries.
Exercise 10.17. Let monotone Ptime be the class of all monotone Ptime properties.
Show that Datalog, even in the presence of a successor relation, fails to
capture monotone Ptime.
Hint: Let σ = {R, S}, where R is ternary, and S is unary. The separating query
is deﬁned as follows: Q is true in A iﬀ the system of linear equations
{x1 + x2 + x3 = 1 | (x1, x2, x3) ∈ RA
} ∪ {x = 0 | x ∈ SA
}
does not have a non-negative solution.
Exercise 10.18. Prove that without the successor relation, Datalog¬ fails to capture
Ptime on ordered structures, even if one allows atoms ¬(x = y).
Hint: The separating query takes a graph, and outputs pairs of nodes (a, b) such
that there is a path from a to b whose length is a perfect square.
Exercise 10.19. Show how to expand Datalog with counting, and prove that the
resulting language is equivalent to the expansion of IFP with counting.
Exercise 10.20. Prove that the expansion of IFP with counting captures Ptime
on the class of planar graphs.
Exercise 10.21. Prove that the class of planar graphs is deﬁnable in IFP.
Exercise 10.22. You may recall that containment of conjunctive queries is NPcomplete
(Exercise 6.19). Prove that containment of arbitrary Datalog queries is
undecidable, but becomes decidable if all intensional predicates are unary.
Exercise 10.23. We say that a Datalog program Π is uniformly bounded if there
is a number n such that on every structure A, the ﬁxed point of FΠ is reached after
at most n steps.
Prove that uniform boundedness is undecidable for Datalog, even for programs
that consist of a single rule.
Exercise 10.24. Consider trees represented as in Chap. 7, i.e., structures with two
successor predicates, labeling predicates, and, furthermore, assume that we have
unary predicates Leaf and Root interpreted as the set of leaves, and the singleton
set containing the root.
210 10 Fixed Point Logics and Complexity Classes
Deﬁne monadic Datalog as the restriction of Datalog where all intensional
predicates are unary.
Prove that over trees, Boolean and unary queries deﬁnable in monadic Datalog
and in MSO are precisely the same. In particular, a tree language is deﬁnable in
monadic Datalog iﬀ it is regular.
Exercise 10.25. Prove that there exists a class C of graphs which admits ﬁxed
points of unbounded depth (i.e., for every n there is an inductive operator that
reaches its ﬁxed point on some graph from C in at least n iterations), and yet
LFP = FO on C.
Remark: this exercise says that it is possible for LFP and FO to coincide on a
class of graphs which admits ﬁxed points of unbounded depth. The negation of this
was known as McColm’s conjecture; hence the goal of this exercise is to disprove
McColm’s conjecture. McColm [181] made two conjectures relating boundedness of
ﬁxed points and collapse of logics; the second conjecture that talks about FO and
the ﬁnite variable logic is known to be true (see Exercise 11.19).
For the next three exercises, consider the following statement, known as the
ordered conjecture (see Kolaitis and Vardi [153]):
If C is an inﬁnite class of ﬁnite ordered structures, then FO LFP on C.
Exercise 10.26. Prove that if the ordered conjecture does not hold, then Ptime =
Pspace.
Exercise 10.27. Prove that if the ordered conjecture holds, then Linh = Etime.
Here Linh is the linear time hierarchy: the class of languages computed in linear
time by alternating Turing machines, with a constant number of alternations, and
Etime is the class of languages computed by deterministic Turing machines in time
2O(n)
.
Exercise 10.28.∗
Does the ordered conjecture hold?
11
Finite Variable Logics
In this chapter, we introduce ﬁnite variable logics: a unifying tool for studying
ﬁxed point logics. These logics use inﬁnitary connectives already seen in
Chap. 8, but here we impose a diﬀerent restriction: each formula can use only
ﬁnitely many variables. We show that ﬁxed point logics LFP, IFP, and PFP
can be embedded in such a ﬁnite variable logic. Furthermore, the ﬁnite variable
logic is easier to study: it can be characterized by games, and this gives us
bounds on the expressive power of ﬁxed point logics; in particular, we show
that without a linear ordering, they fail to capture complexity classes. We
then study deﬁnability and ordering of types in ﬁnite variable logics, and use
these techniques to relate separating complexity classes to separating some
ﬁxed point logics over unordered structures.
11.1 Logics with Finitely Many Variables
Let us revisit the example of the transitive closure of a relation. Suppose E
is a binary relation. We know how to write FO formulae ϕn(x, y) stating that
there is a path from x to y of length n (that is, formulae deﬁning the stages
of the ﬁxed point computation of the transitive closure). One can express
ϕn(x, y), n > 1, as ∃x1 . . . ∃xn−1 E(x, x1)∧. . .∧E(xn−1, y) , and ϕ1(x, y) as
E(x, y). If we could use inﬁnitary disjunctions (i.e., the logic L∞ω of Chap. 8),
we could express the transitive closure query by
n≥1
ϕn(x, y). (11.1)
One could even deﬁne ϕn(x, y) by induction, as we did in Chap. 10:
ϕ1(x, y) ≡ E(x, y), ϕn+1(x, y) ≡ ∃zn E(x, zn) ∧ ϕn(zn, y) , (11.2)
where zn is a fresh variable. The problem with either deﬁnition of the ϕn’s
together with (11.1) is that the logic L∞ω is useless in the context of ﬁnite
212 11 Finite Variable Logics
model theory: as we saw in Chap. 8, it deﬁnes every property of ﬁnite structures
(Proposition 8.4).
However, if we look carefully at the deﬁnition of the ϕn’s given in (11.2),
we can see that there is no need to introduce a fresh variable zn for each new
formula. In fact, we can deﬁne formulae ϕn as follows:
ϕ1(x, y) ≡ E(x, y)
. . . . . . . . .
ϕn+1(x, y) ≡ ∃z E(x, z) ∧ ∃x z = x ∧ ϕn(x, y) .
(11.3)
In deﬁnition (11.3), each formula ϕn uses only three variables, x, y, and z, by
carefully reusing them. To deﬁne ϕn(x, y), we need to say that there is a z
such that E(x, z) holds, and ϕn(z, y) holds. But with three variables, we only
know how to say that ϕn(x, y) holds. So once z is used in E(x, z), it is no
longer needed, and we replace it by x: that is, we say that there is an x such
that x happens to be equal to z, and ϕn(x, y) holds: and we know that the
latter is deﬁnable with three variables.
With these formulae (11.3), we can still deﬁne the transitive closure by
(11.1). What makes the diﬀerence now is the fact that the resulting formula
only uses three variables. If one checks the proof of Proposition 8.4, one discovers
that, to deﬁne arbitrary classes of ﬁnite structures in L∞ω, one needs,
in general, inﬁnitely many variables. So perhaps an inﬁnitary logic in which
the number of variables is ﬁnite could be useful after all?
The answer to this question is a resounding yes: we shall see that all
ﬁxed point logics can be coded in a way very similar to (11.3), and that the
resulting inﬁnitary logic can be analyzed by the same techniques we have seen
in previous chapters.
Deﬁnition 11.1 (Finite variable logics). The class of FO formulae that
use at most k distinct variables will be denoted by FOk
. The class of L∞ω
formulae that use at most k variables will be denoted by Lk
∞ω (reminder: L∞ω
extends FO with inﬁnitary conjunctions and disjunctions ). Finally, we
deﬁne the ﬁnite variable inﬁnitary logic Lω
∞ω by
Lω
∞ω =
k∈N
Lk
∞ω.
That is, Lω
∞ω has formulae of L∞ω that only use ﬁnitely many variables.
The quantiﬁer rank qr(·) of Lω
∞ω formulae is deﬁned as for FO for Boolean
connectives and quantiﬁers; for inﬁnitary connectives, we deﬁne
qr(
i
ϕi) = qr(
i
ϕi) = sup
i
qr(ϕi).
Thus, in general the quantiﬁer rank of an inﬁnitary formula is an ordinal.
For example, if the ϕn’s are FO formulae with qr(ϕn) = n, then
11.1 Logics with Finitely Many Variables 213
qr( n<ω ϕn) = ω, and qr(∃x n<ω ϕn) = ω + 1. When we establish a normal
form for Lω
∞ω, we shall see that over ﬁnite structures it suﬃces to consider
only formulae of quantiﬁer rank up to ω.
Let us give a few examples of deﬁnability in Lω
∞ω. We ﬁrst consider linear
orderings: that is, the vocabulary contains one binary relation <. With the
same trick of reusing variables, we deﬁne the formulae
ψ1(x) ≡ (x = x)
. . . . . . . . .
ψn+1(x) ≡ ∃y (x > y) ∧ ∃x y = x ∧ ψn(x) .
(11.4)
The formula ψn(a) is true in a linear order L iﬀ the set {b | b ≤ a} contains
at least n elements. Indeed, ψ1(x) is true for every x, and ψn+1(x) says that
there is y < x such that there are at least n elements that do not exceed
y. Thus, for each n we have a sentence Ψn ≡ ∃x ψn(x) that is true in L iﬀ
|L|≥ n.
Now let C be an arbitrary subset of N. Consider the sentence
n∈C
Ψn ∧ ¬Ψn+1 .
This is a sentence of L2
∞ω, as it uses only two variables, x and y, and it is
true in L iﬀ |L|∈ C. Hence, arbitrary cardinalities of linear orderings can be
tested in L2
∞ω.
Next, consider ﬁxed point computations. Suppose that an FO formula
ϕ(R, x) deﬁnes an inductive operator; that is, either ϕ is monotone in R,
or we are considering an inﬂationary ﬁxed point. We have seen in Chap. 10
that stages of the ﬁxed point computation can be deﬁned by FO formulae
ϕn
(x); the formulae we used, however, may potentially involve arbitrarily
many variables. To be able to express the least ﬁxed point as n ϕn
(x), we
need to deﬁne those formulae ϕn
(x) more carefully.
Assume that ϕ, in addition to x = (x1, . . . , xk), uses variables z1, . . . , zl.
We introduce additional variables y = (y1, . . . , yk), and deﬁne ϕ0
(x) as
¬(x1 = x1) (i.e., false), and then inductively ϕn+1
(x) as ϕ(R, x) in which
every occurrence of R(u1, . . . , uk), where u1, . . . , uk are variables among x
and z, is replaced by
∃y (y = u) ∧ ∃x((x = y) ∧ ϕn
(x)) . (11.5)
As usual, x = y is an abbreviation for (x1 = y1)∧. . .∧(xk = yk) . Notice that
in the resulting formula, variables from y cannot appear in any subformula of
the form R(·).
The eﬀect of the substitution is that we use ϕ with R being given the
interpretation of the nth stage, so n ϕn
(x) does compute the ﬁxed point.
214 11 Finite Variable Logics
Furthermore, we at most doubled the number of variables in ϕ. Hence, if
ϕ ∈ FOm
, then both lfpR,xϕ and ifpR,xϕ are expressible in L2m
∞ω.
If we have a complex ﬁxed point formula (e.g., involving nested ﬁxed
points), we can then apply the construction inductively, using the same substitution
(11.5), since ϕn
need not be an FO formula, and can have inﬁnitary
connectives. This shows that every LFP or IFP formula is equivalent to a
formula of Lω
∞ω (since for every ﬁxed point, we at most double the number of
variables). Hence, we have the following.
Theorem 11.2. LFP, IFP, PFP ⊆ Lω
∞ω.
Proof. We have proved it already for LFP and IFP; for PFP, the construction
is modiﬁed slightly: instead of taking the disjunction of all the ϕn
’s, we deﬁne
the sentence goodn as ∀x ϕn
(x) ↔ ϕn+1
(x) (indicating that the ﬁxed point
was reached). Then [pfpR,xϕ](y) is expressed by
ψ(y) ≡
n∈N
goodn ∧ ϕn
(x) .
Indeed, if there is no n such that goodn holds, then the partial ﬁxed point
is the empty set, and ψ(y) is equivalent to false. Otherwise, let n0 be the
smallest natural number n for which goodn holds. Then, for all m ≥ n0, we
have ∀x ϕn0
(x) ↔ ϕm
(x) , and hence ψ(y) deﬁnes the partial ﬁxed point.
Therefore, ψ deﬁnes pfpR,xϕ, and it at most doubles the number of variables.
Using this construction inductively, we see that PFP ⊆ Lω
∞ω.
We now revisit the case of orderings. We have shown before that arbitrary
cardinalities of linear orderings are deﬁnable in Lω
∞ω; in other words, every
query on ﬁnite linear orderings is Lω
∞ω-deﬁnable. It turns out that this extends
to all ordered structures.
Proposition 11.3. Every query over ordered ﬁnite σ-structures is expressible
in Lω
∞ω. In fact, if m is the maximum arity of a relation symbol in σ, then it
suﬃces to use Lm+1
∞ω .
Proof. To keep the notation simple, we consider ordered graphs G = V, E ,
with a linear ordering < on V (i.e., m = 2, and in this case we show deﬁnability
in L3
∞ω). Recall that we have an L2
∞ω formula ψn(x), that uses variables x, y,
and tests if there are at least n elements in V which do not exceed x in the
ordering <. Hence, for each n we have an L2
∞ω formula ψ=n(x) which holds iﬀ
x is the nth element in the ordering <. Now, for each G we deﬁne a formula
χG as
∀x∀z E(x, z) ↔
(i,j)∈E
ψ=i(x) ∧ ψ=j(z) ∧ ∃x ψp(x) ∧ ¬∃x ψp+1(x),
viewing the universe V of cardinality p as {1, . . . , p}. Here ψ=j(z) is obtained
from ψ=j(x) by replacing x by z; that is, this formula uses variables z and y.
11.2 Pebble Games 215
Note that χG ∈ L3
∞ω and G′
|= χG iﬀ G′
is isomorphic to G (as an ordered
graph). Finally, for a class P of ordered graphs, we let
ΦP ≡
G∈P
χG.
Clearly, this formula deﬁnes P.
11.2 Pebble Games
In this section we present Ehrenfeucht-Fra¨ıss´e-style games which characterize
ﬁnite variable logics. There are two elements of these games that we have not
seen before. First, these are pebble games: the spoiler and the duplicator have
a ﬁxed set of pairs of pebbles, and each move consists of placing a pebble on
an element of a structure, or removing a pebble and placing it on another
element. Second, the game does not have to end in a ﬁnite number of rounds
(but we can still determine who wins it).
Deﬁnition 11.4 (Pebble games). Let A, B ∈ STRUCT[σ]. A k-pebble
game over A and B is played by the spoiler and the duplicator as follows.
The players have a set of pairs of pebbles {(p1
A, p1
B), . . . , (pk
A, pk
B)}. In each
move, the following happens:
• The spoiler chooses a structure, A or B, and a number 1 ≤ i ≤ k.
For the description of the other moves, we assume the spoiler has
chosen A. The other case, when the spoiler chooses B, is completely sym-
metric.
• The spoiler places the pebble pi
A on some element of A. If pi
A was already
placed on A, this means that the spoiler either leaves it there or removes
it and places it on some other element of A; if pi
A was not used, it means
that the spoiler picks that pebble and places it on an element of A.
• The duplicator responds by placing pi
B on some element of B.
We denote the game that continues for n rounds by PGn
k (A, B), and the
game that continues forever by PG∞
k (A, B).
After each round of the game, the pebbles placed on A and B deﬁne a
relation F ⊆ A × B: if pi
A, for some i ≤ k, is placed on a ∈ A and pi
B is
placed on b ∈ B, then the pair (a, b) is in F.
The duplicator has a winning strategy in PGn
k (A, B) if he can ensure that
after each round j ≤ n, the relation F deﬁnes a partial isomorphism. That is,
F is a graph of a partial isomorphism. In this case we write A ≡∞ω
k,n B.
The duplicator has a winning strategy in PG∞
k (A, B) if he can ensure that
after every round the relation F deﬁnes a partial isomorphism. This is denoted
by A ≡∞ω
k B.
216 11 Finite Variable Logics
L4L5 L5 L5 L5L4 L4 L4
(a) (b) (c) (d)
Fig. 11.1. Spoiler winning the pebble game on L5 and L4
These games characterize ﬁnite variable logics as follows.
Theorem 11.5. a) Two structures A, B ∈ STRUCT[σ] agree on all sentences
of Lk
∞ω of quantiﬁer rank up to n iﬀ A ≡∞ω
k,n B.
b) Two structures A, B ∈ STRUCT[σ] agree on all sentences of Lk
∞ω iﬀ
A ≡∞ω
k B.
Before we prove this theorem, we give a few examples of pebble games.
First, consider two arbitrary linear orderings Ln, Lm of lengths n and m,
n = m. Here we show that it is the spoiler who wins PG∞
2 (Ln, Lm).
The strategy for L5 and L4 is shown in Fig. 11.1; the general strategy
is exactly the same. We have two pairs of pebbles, and elements pebbled by
pebble 1 are shown as circled, and those pebbled by pebble 2 are shown in
dashed boxes. The spoiler starts by placing pebble 1 on the top element of
L5; the duplicator is forced to respond by placing the matching pebble on the
top element of L4. Then the spoiler places the second pebble on the second
element of L5, and the duplicator matches it in L4 (if he does not, he loses in
the next round).
This is the conﬁguration shown in Fig. 11.1 (a). Next, the spoiler removes
pebble 1 from the top element of L5 and places it on the third element. The
spoiler is forced to mimic the move in L4, to preserve the order relation. We
are now in the position shown in Fig. 11.1 (b). The spoiler then moves the
second pebble two levels down; the duplicator matches it. We are now in
position (c). At this point the spoiler places pebble 1 on the last element of
L5, and the duplicator has no place for the matching pebble, and thus he loses
in the position shown in Fig. 11.1 (d).
Note that we could not have expected any other result here, since we know
that all queries over ﬁnite linear orderings are expressible in L2
∞ω; hence, the
duplicator should not be able to win PG∞
2 (Ln, Lm) unless n = m.
11.2 Pebble Games 217
As another example, consider structures of the empty vocabulary: that
is, just sets. We claim the following: if |A|, |B| ≥ k, then the duplicator wins
PG∞
k (A, B); in other words, A ≡∞ω
k B. Indeed, the strategy for the duplicator
is very similar to his strategy in the Ehrenfeucht-Fra¨ıss´e game: at all times, he
has to maintain the condition that pi
A and pj
A are placed on the same element
iﬀ pi
B and pj
B are placed on the same element. Since both sets have at least k
elements, this condition is easily maintained, and the duplicator can win the
inﬁnite game. This gives us the following.
Corollary 11.6. The query even is not expressible in Lω
∞ω.
Proof. Assume, to the contrary, that even is expressible by a sentence Φ of
Lω
∞ω. Let k be such that Φ ∈ Lk
∞ω. Choose two sets A and B of cardinalities k
and k + 1, respectively. By the above, A ≡∞ω
k B and hence A |= Φ iﬀ B |= Φ.
This, however, contradicts the assumption that Φ deﬁnes even.
From Corollary 11.6, we derive a result mentioned, but not proved, in
Chap. 10.
Corollary 11.7. • LFP (LFP+<)inv.
• IFP (IFP+<)inv.
• PFP (PFP+<)inv.
Proof. Since LFP, IFP, PFP ⊆ Lω
∞ω, none of them deﬁnes even; however,
over ordered structures these logics capture Ptime and Pspace, and hence
can deﬁne even.
Before proving Theorem 11.5, we make two additional observations. First,
consider an inﬁnitary disjunction ϕ ≡ i∈I ϕi, where all ϕi are FO formulae,
and assume that qr(ϕ) ≤ n. This means that qr(ϕi) ≤ n for all i ∈ I. We
know that, up to logical equivalence, there are only ﬁnitely many diﬀerent FO
formulae of quantiﬁer rank n. Hence, there is a ﬁnite subset I0 ⊂ I such that
ϕ is equivalent to i∈I0
ϕi; that is, to an FO formula. Using this argument
inductively on the structure of Lω
∞ω formulae, we conclude that for every k,
every Lk
∞ω formula of quantiﬁer rank n is equivalent to an FOk
formula of
the same quantiﬁer rank. Hence, if A and B agree on all FOk
sentences of
quantiﬁer rank at most n, then A ≡∞ω
k,n B.
Now assume that A and B agree on all FOk
sentences. That is, for every
n, we have A ≡∞ω
k,n B. Since A and B are ﬁnite, so is the number of diﬀerent
maps from Ak
to Bk
, and hence every inﬁnite strategy in PG∞
k (A, B) is
completely determined by a ﬁnite strategy for suﬃciently large n: the one in
which all (ﬁnitely many) possible conﬁgurations of the game appeared. Thus,
for suﬃciently large n (that depends on A and B), winning PGn
k (A, B) implies
winning PG∞
k (A, B). We therefore obtain the following.
Proposition 11.8. For every two structures A, B, the following are equiva-
lent:
218 11 Finite Variable Logics
1. A and B agree on all FOk
sentences, and
2. A and B agree on all Lk
∞ω sentences.
The second observation is about formulae with free variables. We write
(A, a) ≡∞ω
k,n (B, b) (or (A, a) ≡∞ω
k (B, b)), where | a | = | b | = m ≤ k, if the
duplicator wins the game PGn
k (A, B) (or PG∞
k (A, B)) from the position where
the ﬁrst m pebbles have been placed on the elements of a and b respectively.
A slight modiﬁcation of the proof of Theorem 11.5 shows the following.
Corollary 11.9. Given two structures, A, B, and a ∈ Am
, b ∈ Bm
, m ≤ k,
a) (A, a) ≡∞ω
k,n (B, b) iﬀ for every ϕ(x) ∈ Lk
∞ω with qr(ϕ) ≤ n, it is the case
that A |= ϕ(a) ⇔ B |= ϕ(b).
b) (A, a) ≡∞ω
k (B, b) iﬀ for every ϕ(x) ∈ Lk
∞ω, it is the case that
A |= ϕ(a) ⇔ B |= ϕ(b).
We are now ready to prove Theorem 11.5. As with the Ehrenfeucht-Fra¨ıss´e
theorem, we shall use a certain back-and-forth property in the proof. We start
with a few deﬁnitions.
Given a partial map f : A → B, its domain and range will be denoted by
dom(f) and rng(f); that is, f is deﬁned on dom(f) ⊆ A, and f(dom(f)) =
rng(f) ⊆ B.
We let symbols α and β range over ﬁnite and inﬁnite ordinals. Given two
structures A and B and an ordinal β, let Iβ be a set of partial isomorphisms
between A and B, and let Iα = {Iβ | β < α}. We say that Iα has the
k-back-and-forth property if the following conditions hold:
• Every set Iβ is nonempty.
• Iβ′ ⊆ Iβ for β < β′
.
• Each Iβ is downward-closed: if g ∈ Iβ and f ⊆ g (i.e., dom(f) ⊆ dom(g),
and f and g coincide on dom(f)), then f ∈ Iβ.
• If f ∈ Iβ+1 and |dom(f)| < k, then
forth: for every a ∈ A, there is g ∈ Iβ such that f ⊆ g and a ∈ dom(g);
back: for every b ∈ B, there is g ∈ Iβ such that f ⊆ g and b ∈ rng(g).
As before, games are nothing but a reformulation of the back-and-forth
property. Indeed, for a ﬁnite α, having a family Iα with the k-back-and-forth
property is equivalent to A ≡∞ω
k,α−1 B: the collection Iβ simply consists of
conﬁgurations from which the duplicator wins with β moves remaining. This
also suﬃces for inﬁnitely long games: as we remarked earlier, for every two
ﬁnite structures A and B, and for some n, depending on A and B, it is the
case that A ≡∞ω
k,n B implies A ≡∞ω
k B. Furthermore, if we have a suﬃciently
11.2 Pebble Games 219
long ﬁnite chain Iα, some Iβ’s will be repeated, as there are only ﬁnitely
many partial isomorphisms between A and B. Hence, such a chain can then
be extended to arbitrary ordinal length.
Therefore, it will be suﬃcient to establish equivalence between indistinguishability
in Lk
∞ω and the existence of a family of partial isomorphisms with
the k-back-and-forth property. This is done in the following lemma.
Lemma 11.10. Given two structures A and B, they agree on all sentences of
Lk
∞ω of quantiﬁer rank < α iﬀ there is a family Iα = {Iβ | β < α} of partial
isomorphisms between A and B with the k-back-and-forth property.
In the rest of the section, we prove Lemma 11.10. Suppose A and B agree
on all sentences of Lk
∞ω of quantiﬁer rank < α. Let β < α. Deﬁne Iβ as the
set of partial isomorphisms f with |dom(f)|≤ k such that for every ϕ ∈ Lk
∞ω
with qr(ϕ) ≤ β, and every a contained in dom(f),
A |= ϕ(a) ⇔ B |= ϕ(f(a)).
We show that Iα = {Iβ | β < α} has the k-back-and-forth property.
Since A and B agree on all sentences of Lk
∞ω of quantiﬁer rank < α, each Iβ
is nonempty as it contains the empty partial isomorphism. The containment
Iβ′ ⊆ Iβ for β < β′
is immediate from the deﬁnition, as is downward-closure.
Thus, it remains to prove the back-and-forth property.
Assume, to the contrary, that we found f ∈ Iβ+1, with β+1 < α, such that
|dom(f)| = m < k, and f violates the forth condition. That is, there exists
a ∈ A such that there is no g ∈ Iβ extending f with a ∈ dom(g). In this case,
by the deﬁnition of Iβ, for every b ∈ B we can ﬁnd a formula ϕb(x0, x1, . . . , xm)
of quantiﬁer rank at most β such that for some a1, . . . , am ∈ dom(f), we have
A |= ϕb(a, a1, . . . , am) and B |= ¬ϕb(b, f(a1), . . . , f(am)).
Now let
ϕ(x1, . . . , xm) ≡ ∃x0
b∈B
ϕb(x0, x1, . . . , xm).
Clearly, A |= ϕ(a1, . . . , am), but B |= ¬ϕ(f(a1), . . . , f(am)), which contradicts
our assumption f ∈ Iβ+1 (since qr(ϕ) ≤ β + 1). The case when f violates the
back condition is handled similarly.
For the other direction, assume that we have a family Iα with the k-backand-forth
property. We use (transﬁnite) induction on β to show that for every
ϕ(x1, . . . , xm) ∈ Lk
∞ω, m ≤ k, with qr(ϕ) ≤ β < α,
for every f ∈ Iβ, a1, . . . , am ∈ dom(f) :
A |= ϕ(a1, . . . , am) ⇔ B |= ϕ(f(a1), . . . , f(am)).
(11.6)
Clearly, (11.6) suﬃces, since it implies that A and B agree on Lk
∞ω sentences
of quantiﬁer rank < α.
220 11 Finite Variable Logics
The basis case is β = 0. Then ϕ is a Boolean combination of atomic
formulae (for ﬁnite quantiﬁer ranks, as we saw, inﬁnitary connectives are
superﬂuous), and hence (11.6) follows from the assumption that f is a partial
isomorphism.
We now use induction on the structure of ϕ. The case of Boolean combinations
is trivial. If ϕ ≡ i ϕi and qr(ϕ) > qr(ϕi) for all i, then β is a limit
ordinal and again (11.6) for ϕ easily follows by applying the hypothesis to all
the ϕi’s of smaller quantiﬁer rank.
Thus, it remains to consider the case of ϕ(x1, . . . , xm) ≡
∃x0 ψ(x0, . . . , xm), with qr(ϕ) = β + 1 and qr(ψ) = β for some β with
β + 1 < α. We can assume without loss of generality that x0 is not among
x1, . . . , xm (exercise: why?) and hence m < k.
Let f ∈ Iβ+1 and a1, . . . , am ∈ dom(f). Assume that A |= ϕ(a1, . . . , am);
that is, for some a0 ∈ A, A |= ψ(a0, a1, . . . , am). Since Iβ+1 is downwardclosed,
we can further assume that dom(f) = {a1, . . . , am}. Since |dom(f)| =
m < k, by the k-back-and-forth property we ﬁnd g ∈ Iβ extending f
such that a0 ∈ dom(g). Applying (11.6) inductively to ψ, we derive B |=
ψ(g(a0), g(a1), . . . , g(am)). That is, B |= ψ(g(a0), f(a1), . . . , f(am)) since f
and g agree on a1, . . . , am. Hence, B |= ϕ(f(a1), . . . , f(am)).
The other direction, that B |= ϕ(f(a1), . . . , f(am)) implies A |=
ϕ(a1, . . . , am), is completely symmetric. This ﬁnishes the proof of (11.6),
Lemma 11.10, and Theorem 11.5.
11.3 Deﬁnability of Types
For logics like FO and MSO, we have used rank-k types, which are collections
of all formulae of quantiﬁer rank k that hold in a given structure. An extremely
useful feature of types is that they can be deﬁned by formulae of quantiﬁer
rank k, and we have used this fact many times.
When we move to ﬁnite variable logics, the role of parameter k is played
by the number of variables rather than the quantiﬁer rank. We can, therefore,
deﬁne, FOk
-types, but then it is not immediately clear if every such type is
itself deﬁnable in FOk
. In this section we prove that this is the case. As with
the case of FO or MSO types, this deﬁnability result proves very useful, and
we derive some interesting corollaries. In particular, we establish a normal
form for Lk
∞ω, and prove that every class of ﬁnite structures that is closed
under ≡∞ω
k is deﬁnable in Lk
∞ω.
Deﬁnition 11.11 (FOk
-types). Given a structure A and a tuple a, the
FOk
-type of (A, a) is
tpFOk (A, a) = {ϕ(x) ∈ FOk
| A |= ϕ(a)}.
An FOk
-type is any set of formulae of FOk
of the form tpFOk (A, a).
11.3 Deﬁnability of Types 221
One could have deﬁned Lk
∞ω-types as well, as the set of all Lk
∞ω formulae
that hold in (A, a). This, however, would be unnecessary, since every FOk
type
completely determines the Lk
∞ω-type: this follows from Proposition 11.8
stating that two structures agree on all Lk
∞ω formulae iﬀ they agree on all
FOk
formulae.
Note that unlike in the cases of FO and MSO, the number of diﬀerent
FOk
-types need not be ﬁnite, since we do not restrict the quantiﬁer rank. In
fact we saw in the example of ﬁnite linear orderings that there are inﬁnitely
many diﬀerent FO2
-types, since every ﬁnite cardinality of a linear ordering
can be characterized by an FO2
sentence.
Each FOk
-type τ is trivially deﬁnable in Lk
∞ω by ϕ∈τ ϕ. More interestingly,
we can show that FOk
-types are deﬁnable without inﬁnitary connectives.
Theorem 11.12. For every FOk
-type τ, there is an FOk
formula ϕτ (x) such
that, for every structure A,
tpFOk (A, a) = τ ⇔ A |= ϕτ (a).
Before we prove Theorem 11.12, let us state a few corollaries. First, restricting
our attention to sentences, we obtain the following.
Corollary 11.13. For every structure A, there is a sentence ΨA of FOk
such
that for any other structure B, we have B |= ΨA iﬀ A ≡∞ω
k B.
We know that without restrictions on the number of variables, we can
write a sentence that tests if B is isomorphic to A, and this is why the full
inﬁnitary logic deﬁnes every class of ﬁnite structures. Corollary 11.13 shows
that, rather than testing isomorphism as in the full inﬁnitary logic, in Lk
∞ω
one can write a sentence that tests ≡∞ω
k -equivalence.
We can also see that closure under ≡∞ω
k is suﬃcient for deﬁnability in
Lk
∞ω.
Corollary 11.14. If a class C of structures is closed under ≡∞ω
k (i.e., A ∈ C
and A ≡∞ω
k B imply B ∈ C), then C is deﬁnable in Lk
∞ω.
Proof. Let T be the collection of Lk
∞ω-types τ such that there is a structure
A in C with tpFOk (A) = τ. From closure under ≡∞ω
k it follows that τ∈T ϕτ
deﬁnes C.
Deﬁnability of Lk
∞ω-types also yields a normal form result, stating that
only countable disjunctions of FOk
formulae suﬃce.
Corollary 11.15. Every Lk
∞ω formula is equivalent to a single countable disjunction
of FOk
formulae.
222 11 Finite Variable Logics
Proof. Let ϕ(x) be an Lk
∞ω formula. Consider the set Cϕ = {(A, a) | A |=
ϕ(a)}, such that no two elements of Cϕ are isomorphic (this ensures that Cϕ
is countable, since there are only countably many isomorphism types of ﬁnite
structures). Let ϕA,a(x) be the FOk
formula deﬁning tpFOk (A, a). Let
ψ(x) ≡
(A,a)∈Cϕ
ϕ(A,a)(x).
We claim that ϕ and ψ are equivalent. Suppose B |= ϕ(b). Let (B′
, b′
) be
an isomorphic copy of (B, b) present in Cϕ. Then B′
|= ϕ(B′,b′)(b′
) and thus
B′
|= ψ(b′
) and B |= ψ(b). Conversely, if B |= ψ(b), then for some A and a
with A |= ϕ(a), we have tpFOk (A, a) = tpFOk (B, b); that is, (A, a) ≡∞ω
k (B, b).
Since ϕ is an Lk
∞ω formula, this implies B |= ϕ(b), showing that ϕ and ψ are
equivalent.
Since the negation of an Lk
∞ω formula is an Lk
∞ω formula, we obtain a dual
result.
Corollary 11.16. Every Lk
∞ω formula is equivalent to a single countable conjunction
of FOk
formulae.
We now present the proof of Theorem 11.12. To keep the notation simple,
we look at the case when there are no free variables; that is, we deal
with tpFOk (A). Another assumption that we make is that the vocabulary σ is
purely relational. Adding free variables and constant symbols poses no problem
(Exercise 11.1).
Fix a structure A, and let A≤k
be the set of all tuples of elements of A
of length up to k. For any a = (a1, . . . , al) ∈ A≤k
, where l ≤ k, we deﬁne a
formula ϕm
a (x1, . . . , xl). Intuitively, these formulae will have the property that
they precisely characterize what one can say about a in FOk
, with quantiﬁer
rank at most m: that is, B |= ϕm
a (b) iﬀ (A, a) and (B, b) agree on all the FOk
formulae of quantiﬁer rank up to m.
To deﬁne these formulae, consider partial functions h : {x1, . . . , xk} → A,
and ﬁrst deﬁne formulae ϕm
h (y), with free variables y being those in dom(h),
as follows:
• ϕ0
h(y) is the conjunction of all atomic and negated atomic formulae true
in A of h(y).
• To deﬁne ϕm+1
h (y), consider two cases:
1. Suppose |dom(h)|< k. Let i be the least index such that xi ∈ dom(h),
and ha be the extension of h deﬁned on dom(h) ∪ {xi} such that
ha(xi) = a. Then
ϕm+1
h (y) ≡ ϕm
h (y) ∧
a∈A
∃xi ϕm
ha
(y, xi) ∧ ∀xi
a∈A
ϕm
ha
(y, xi).
11.3 Deﬁnability of Types 223
2. Suppose |dom(h)| = k. Let hi be the restriction of h which is not
deﬁned only on xi. Then
ϕm+1
h (x) ≡ ϕm
h (x) ∧
k
i=1
ϕm+1
hi
(xi),
where xi is x with the variable xi excluded.
Finally, we deﬁne ϕm
a (x1, . . . , xl) as ϕm
h (x), where h is given by h(xi) = ai,
for i = 1, . . . , l.
To show that formulae ϕm
a do what they are supposed to do, we show
that if they hold, a certain sequence of sets of partial isomorphisms with the
k-back-and-forth property must exist.
Lemma 11.17. Let a = (a1, . . . , al) ∈ A≤k
. Then B |= ϕm
a (b) iﬀ there exists
a collection Im = {I0, I1, . . . , Im} of sets of partial isomorphism between A
and B with the k-back-and-forth property such that Im ⊆ Im−1 ⊆ . . . ⊆ I0,
and g = {(a1, b1), . . . , (al, bl)} ∈ Im.
Proof of Lemma 11.17. Since qr(ϕm
a ) = m and A |= ϕm
a (a), the existence of
Im implies, by Lemma 11.10, that B |= ϕm
a (b).
For the converse, we establish the existence of Im by induction on m.
If m = 0, we let I0 consist of all the restrictions of g. Clearly, I0 is not
empty, and since g is a partial isomorphism (because, by the assumption,
B |= ϕ0
a(b), and thus a and b satisfy the same atomic formulae), all elements
of I0 are partial isomorphisms.
For the induction step, to go from m to m + 1, we distinguish two cases.
Case 1: l < k. From B |= ϕm+1
a (b) and the deﬁnition of ϕm+1
a it follows
that B |= ϕm
a (b), and thus we have, by the induction hypothesis, a sequence
I′
m = {I′
0, . . . , I′
m} of partial isomorphisms with the k-back-and-forth property
such that g ∈ I′
m.
Looking at the second conjunct of ϕm+1
a and applying the induction hypothesis
for m, we see that for every a ∈ A there exists b ∈ B and a sequence
Ia
m = {Ia
0 , . . . , Ia
m} of partial isomorphisms with the k-back-and-forth property
such that ga,b = {(a1, b1), . . . , (al, bl), (a, b)} ∈ Ia
m.
We now deﬁne:
Ii = I′
i ∪
a∈A
Ia
i for i ≤ m
Im+1 = {f | f ⊆ g}.
It is easy to see that component-wise unions like this preserve the k-backand-forth
property. Furthermore, since g ∈ I′
m, then Im+1 ⊆ I′
m ⊆ Im. Thus,
we only have to check the k-back-and-forth property with respect to Im+1
and Im. But this is guaranteed by the second and the third conjunct of ϕm+1
a .
224 11 Finite Variable Logics
Indeed, consider g and a ∈ A − dom(g). Since B |= ϕm+1
a (b), by the second
conjunct we see that B |= ∃xϕm
aa(b, x) and hence for some b ∈ B, we have
B |= ϕm
aa(bb). But then g ∪ {(a, b)} ∈ I′
m ⊆ Im. The back property is proved
similarly. This completes the proof for case 1.
Case 2: l = k. By the deﬁnition of ϕm+1
a for the case of l = k, we see that
B |= ϕm
a (b), and hence by the induction hypothesis, g is a partial isomorphism.
For each i ≤ k, let gi be g without the pair (ai, bi). Applying the argument
for the case l < k to each gi, we get a sequence of partial isomorphisms
{Ii
0, . . . , Ii
m+1} with the k-back-and-forth property such that Ii
m+1 ⊆ . . . ⊆
Ii
0. Now we deﬁne
Ij = {g} ∪
k
i=1
Ii
j, j ≤ m + 1.
One can easily verify all the properties of a sequence of partial isomorphisms
with the k-back-and-forth property: in fact, all of the properties are preserved
under component-wise union, and since |dom(g)| = k, the k-back-and-forth
extension for g is not required. This completes the proof case 2 and Lemma
11.17.
For each a ∈ A≤k
, consider ϕm
a (A) = {a0 | A |= ϕm
a (a0)}. By deﬁnition,
ϕm+1
a is of the form ϕm
a ∧ . . ., and hence
ϕ0
a(A) ⊇ ϕ1
a(A) ⊇ . . . ⊇ ϕm
a (A) ⊇ ϕm+1
a (A) ⊇ . . . .
Since A is ﬁnite, this sequence eventually stabilizes. Let ma be the number
such that ϕma
a (A) = ϕm
a (A) for all m > ma. Then we deﬁne
M = max
a∈A≤k
ma, and
ΨA ≡ ϕM
ǫ ∧
a∈A≤k
∀x1 . . . ∀xk ϕM
a (x) → ϕM+1
a (x) . (11.7)
Here ǫ stands for the empty sequence. By the deﬁnition of M, A |= ΨA.
Furthermore, ΨA ∈ FOk
.
Thus, to conclude the proof, we show that ΨA deﬁnes tpFOk (A). In other
words, we need the following.
Lemma 11.18. If B is a ﬁnite structure, then B |= ΨA iﬀ tpFOk (A) =
tpFOk (B); that is, A ≡∞ω
k B.
Proof of Lemma 11.18. Since ΨA ∈ FOk
and A |= ΨA, it suﬃces to show that
A ≡∞ω
k B whenever B |= ΨA.
Let B |= ΨA. We deﬁne a set G of partial maps between A and B by
{(a1, b1), . . . , (al, bl)} ∈ G ⇔ B |= ϕM+1
(a1,...,al)(b1, . . . , bl).
11.4 Ordering of Types 225
Since B |= ΨA, the sentence ϕM+1
ǫ is true in B, and thus G is nonempty, as
the empty partial map is a member of G.
Applying Lemma 11.17 to each g = {(a1, b1), . . . , (al, bl)} ∈ G, we see that
there is a sequence Ig
= {Ig
0 , . . . , Ig
M+1} of partial isomorphisms with the
k-back-and-forth property such that Ig
0 ⊇ . . . ⊇ Ig
M+1 and g ∈ Ig
M+1. We
now deﬁne a family I = {Ii | i ∈ N} by
Ii =
g∈G
Ig
i for i ≤ M + 1
Ii = IM+1 for i > M + 1.
It remains to show that I has the k-back-and-forth property. As we have
seen in the proof of Lemma 11.17, the k-back-and-forth property is preserved
through component-wise union, and since all Ii, i > M + 1, are identical, it
suﬃces to prove that every partial isomorphism in IM+2 can be extended in
IM+1.
Fix f ∈ IM+2 such that |dom(f)| < k. We show the forth part; the back
part is identical. Let a ∈ A. Since f ∈ IM+2, and the sequence {I0, . . . , IM+1}
has the k-back-and-forth property, we can ﬁnd f′
∈ IM with f ⊆ f′
and
a ∈ dom(f′
). Let f′
= {(a1, b1), . . . , (al, bl)}. Since f′
is a partial isomorphism
from IM , from Lemma 11.17 we conclude that B |= ϕM
(a1,...,al)(b1, . . . , bl).
Now from the implication in (11.7), we see that B |= ϕM+1
(a1,...,al)(b1, . . . , bl);
therefore, f′
∈ G. But then f′
∈ If′
M+1 and hence f′
∈ IM+1, which proves
the forth part. Since the back part is symmetric, this concludes the proof of
Lemma 11.18 and Theorem 11.12.
11.4 Ordering of Types
In this section, we show that many interesting properties of types can be
expressed in LFP. In particular, consider the following equivalence relation
≈FOk on tuples of elements of a structure A:
a ≈FOk b ⇔ tpFOk (A, a) = tpFOk (A, b).
Clearly this relation is deﬁnable by an Lk
∞ω formula
ψ(x, y) ≡
τ
ϕτ (x) ∧ ϕτ (y) ,
where τ ranges over all FOk
-types.
It is more interesting, however, that this relation is deﬁnable in a weaker
logic LFP. Furthermore, it turns out that there is a formula of LFP that
deﬁnes a certain preorder ≺FOk on tuples, such that the equivalence relation
induced by this preorder is precisely ≈FOk . This means that on structures in
which all elements have diﬀerent FOk
-types, we can deﬁne a linear order in
226 11 Finite Variable Logics
LFP, and hence, by the Immerman-Vardi theorem, on such structures LFP
captures Ptime.
We start by showing how to deﬁne ≈FOk .
Proposition 11.19. Fix a vocabulary σ. For every k and l ≤ k, there is an
LFP formula η(x, y) in 2l free variables such that for every A ∈ STRUCT[σ],
A |= η(a, b) ⇔ a ≈FOk b.
Proof. The atomic FOk
-type of (A, a), with |a|= l ≤ k, is the conjunction of
all atomic and negated atomic formulae true of a in A. Since there are ﬁnitely
many atomic FOk
-formulae, up to logical equivalence, each atomic type is
deﬁnable by an FOk
formula. Let α1(x), . . . , αs(x) list all such formulae. Then
we deﬁne
ψ0(x, y) ≡
i,j≤s, i=j
αi(x) ∧ αj(y) .
This is a formula of quantiﬁer rank 0, and A |= ψ0(a, b) iﬀ the atomic FOk
types
of a and b are diﬀerent.
Next, we deﬁne a formula ψ in the vocabulary σ expanded with a 2l-ary
relation R:
ψ(R, x, y) ≡ ψ0(x, y) ∨
l
i=1
∃xi∀yiR(x, y) ∨
l
i=1
∃yi∀xiR(x, y), (11.8)
and let
ϕ(x, y) ≡ [lfpR,x,y ψ(R, x, y)](x, y).
Consider the ﬁxed point computation for ψ. Initially, we have tuples (a, b)
with diﬀerent atomic types; that is, tuples corresponding to the position in
the pebble game in which the spoiler wins. At the next stage, we get all the
positions of the pebble game (a, b) such that, in one move, the spoiler can force
the winning position. In general, the ith stage consists of positions from which
the spoiler can win the pebble game in i − 1 moves, and hence A |= ϕ(a, b)
iﬀ from the position (a, b), the spoiler can win the game. In other words,
A |= ϕ(a, b) iﬀ (A, a) ≡∞ω
k (A, b), or, equivalently, tpFOk (A, a) = tpFOk (A, b).
Hence, η can be deﬁned as ¬ϕ, which is an LFP formula.
We now extend this technique to deﬁne a preorder ≺FOk on tuples, whose
associated equivalence relation is precisely ≈FOk .
Suppose we have a set X partitioned into subsets X1, . . . , Xm. Consider a
binary relation ≺ on X given by
x ≺ y ⇔ x ∈ Xi, y ∈ Xj, and i < j.
We call relations obtained in such a way strict preorders. With each strict
preorder ≺ we associate an equivalence relation whose equivalence classes are
precisely X1, . . . , Xm. It can be deﬁned by the formula ¬(x ≺ y) ∧ ¬(y ≺ x).
11.4 Ordering of Types 227
Theorem 11.20. For every vocabulary σ, and every k, there exists an LFP
formula χ(x, y), with | x |=| y |= k, such that on every A ∈ STRUCT[σ], the
formula χ deﬁnes a strict preorder ≺FOk whose equivalence relation is ≈FOk .
As we mentioned before, this result becomes useful when one deals with
structures A such that for every a, b ∈ A, tpFOk (a) = tpFOk (b) whenever
a = b. Such structures are called k-rigid.
Theorem 11.20 tells us that in a k-rigid structure, there is an LFP-deﬁnable
strict preorder whose equivalence classes are of size 1: that is, a linear order.
Hence, from the Immerman-Vardi theorem we obtain:
Corollary 11.21. Over k-rigid structures, LFP captures Ptime.
Now we prove Theorem 11.20. We shall use the following notation. If a =
(a1, . . . , ak) is a tuple, then ai←a is the tuple in which ai was replaced by a,
i.e., (a1, . . . , ai−1, a, ai+1, . . . , ak).
Recall the formula ψ(x, y) (11.8). The ﬁxed point of this formula deﬁned
the complement of ≈FOk , and it follows from the proof of Proposition 11.19
that the jth stage of the ﬁxed point computation for ψ, ψj
(x, y), deﬁnes the
set of positions from which the spoiler wins with j − 1 moves remaining. In
other words, A |= ψj
(a, a) iﬀ (A, a) and (B, b) disagree on some FOk
formula
of quantiﬁer rank up to j − 1.
We now use this formula ψ to deﬁne a formula γ(S, x, y) such that the
jth stage of the inﬂationary ﬁxed point computation for γ deﬁnes a strict
preorder whose equivalence relation is the complement of the relation deﬁned
by ψj
(x, y). In other words, γj
(A) deﬁnes a relation ≺j on Ak
such that the
equivalence relation ∼j associated with this preorder is
a ∼j b ⇔ (A, a) ≡∞ω
k,j−1 (A, b).
We now explain the idea of the construction. In the beginning, we have to deal
with atomic FOk
-types. Since these can be explicitly deﬁned (see the proof of
Proposition 11.19), we can choose an arbitrary ordering on them.
Now, suppose we have deﬁned ≺j, the jth stage of the ﬁxed point computation
for γ, whose equivalence relation is the set of positions from which the
duplicator can play for j − 1 moves (i.e., the complement of the jth stage of
ψ). Let Y1, . . . , Ys be the equivalence classes.
We have to reﬁne ≺j to come up with a preorder ≺j+1. For that, we
have to order tuples (a, b) which were equivalent at the jth stage, but become
nonequivalent at stage j + 1. But these are precisely the tuples that get into
the ﬁxed point of ψ at stage j + 1.
Looking at the deﬁnition of ψ (11.8), we see that there are two ways for
ψj+1
(a, b) to be true (i.e., for (a, b) to get into the ﬁxed point at stage j + 1):
1. There is a ∈ A such that ϕj
(ai←a, bi←b) holds for every b ∈ A. In other
words, the equivalence class of ai←a contains no tuple of the form bi←b
which is diﬀerent from b.
228 11 Finite Variable Logics
2. Symmetrically, there is b ∈ A such that the equivalence class of bi←b
contains no tuple of the form ai←a = a.
Assume that i′
is the minimum number ≤ k such that either 1 or 2 above, or
both, happen. Let Y be the set of all the tuples ai′←a for case 1 and bi′←b for
case 2. We then consider the smallest, with respect to ≺j, equivalence class
Yp’s into which elements of Y may fall. Note that it is impossible that for
some a, b, both ai′←a and bi′←b are in Yp. Hence, either
1′
. for some a, ai′←a is in Yp, or
2′
. for some b, bi′←b is in Yp.
In case 1′
, we let a ≺j+1 b, and in case 2′
, we let b ≺j+1 a.
This is the algorithm; it remains to express it in LFP. The formula χ(x, y)
will be deﬁned as [ifpS,x,yγ(S, x, y)](x, y). To express γ, we ﬁrst deal with the
atomic case. Since we have an explicit listing α1, . . . , αs of formulae deﬁning
atomic types, we can use
γ0(x, y) ≡
i<j
αi(x) ∧ αj(y)
to order atomic types.
Next, we deﬁne
ξ′
i(x, y) ≡ ∀xi∃yi ¬S(x, y) ∧ ¬S(y, x) ∧ ∀yi∃xi ¬S(x, y) ∧ ¬S(y, x) ,
ξi(x, y) ≡
p<i
ξ′
p(x, y) ∧ ξ′
i(x, y).
The formula ξi(x, b) will be used to determine the position i′
in the algorithm.
To select tuples ai←a which are inequivalent to all tuples bi←b, we use the
formula
δ1
i (x, x, y) ≡ ∀y S(xi←x, yi←y) ∨ S(yi←y, xi←x) ,
and δ2
i (y, x, y) for the symmetric case (in which we reverse the roles of x and
y).
Finally, we get the following deﬁnition of γ(x, y):
γ0(x, y) ∨
¬S(y, x) ∧
l
i=1
ξi(x, y) ∧
∃x δ1
i (x, x, y) ∧ ∀y δ2
i (y, x, y) → S(xi←x, yi←y)
.
Notice that γ is not positive in S; however, by the Gurevich-Shelah theorem,
ifpS,x,yγ is equivalent to an LFP formula.
11.5 Canonical Structures and the Abiteboul-Vianu Theorem 229
We leave it to the reader to complete the proof: that is, to show that γ indeed
codes the algorithm described in the beginning of the proof, and to prove
by induction that the jth stage of the inﬂationary ﬁxed point computation
for γ deﬁnes a preorder whose equivalence relation is ≡∞ω
k,j−1.
11.5 Canonical Structures and the Abiteboul-Vianu
Theorem
Using deﬁnability of a linear ordering on FOk
-types, we show how to convert
each structure A into another structure Ck(A), which, in essence, captures
all the information about Lk
∞ω-deﬁnability over A. The main application of
this construction is the Abiteboul-Vianu theorem, which reduces the problem
of separating complexity classes Ptime and Pspace to separating two logics
over unordered structures (recall that Ptime and Pspace are captured by
LFP and PFP over structures with a linear ordering).
Fix k > 0, and a purely relational vocabulary σ = {R1, . . . , Rl} such
that the arity of each Ri is at most k (since we shall be dealing with FOk
formulae, we can impose this additional restriction without loss of generality).
We shall use the preorder relation ≺FOk deﬁned in the previous section; its
equivalence relation is a ≈FOk b given by tpFOk (A, a) = tpFOk (A, b), for a, b ∈
Ak
. Whenever k and A are clear from the context, we shall write [a] for the
≈FOk -equivalence class of a.
Deﬁnition 11.22. Given a vocabulary σ = {R1, . . . , Rl}, where the arities of
all the Ri’s do not exceed k, and a σ-structure A, we deﬁne a new vocabulary
ck(σ) and a structure Ck(A) ∈ STRUCT[ck(σ)] as follows.
Let t = kk
, and let π1, . . . , πt enumerate all the functions π : {1, . . ., k} →
{1, . . . , k}. Then
ck(σ) = {<, U, U1, . . . , Ul, S1, . . . , Sk, P1, . . . , Pt},
where <, the Si’s, and the Pj’s are binary, and U, U1, . . . , Ul are unary.
The universe of Ck(A) is Ak
/ ≈FOk , the set of ≈FOk -equivalence classes
of k-tuples from A. The interpretation of the predicates is as follows (where a
stands for (a1, . . . , ak)):
• < is interpreted as ≺FOk .
• U([a]) holds iﬀ a1 = a2.
• Ui([a]) holds iﬀ (a1, . . . , am) ∈ RA
i , where m ≤ k is the arity of Ri.
• Si([a], [b]) holds iﬀ a and b diﬀer at most in their ith component.
• Pπ contains pairs ([a], [(aπ(1), . . . , aπ(k))]) for all a ∈ Ak
.
Lemma 11.23. The structure Ck(A) is well-deﬁned, and < is interpreted as
a linear ordering on its universe.
230 11 Finite Variable Logics
Proof. Suppose U([a]) holds and b ∈ [a]. Then a1 = a2, and since tpFOk (a) =
tpFOk (b), we have b1 = b2. Since other predicates of Ck(A) are deﬁned in
terms of atomic formulae over A, they are likewise independent of particular
representatives of the equivalence classes. Finally, Theorem 11.20 implies that
< is a linear ordering on Ak
/ ≈FOk .
The structure Ck(A) can be viewed as a canonical structure in terms of
Lk
∞ω-deﬁnability.
Proposition 11.24. For every A, B ∈ STRUCT[σ],
A ≡∞ω
k B ⇔ Ck(A) ∼= Ck(B).
Proof sketch. Suppose A ≡∞ω
k B. Since every FOk
-type is deﬁnable by an FOk
formula, every type that is realized in A is realized in B. Hence, | A |=| B |.
Furthermore, since ≺FOk is deﬁnable by the same formula on all σ-structures,
we have an order-preserving map h : Ak
/ ≈FOk → Bk
/ ≈FOk . It is easy to
verify that such a map is an isomorphism between Ck(A) and Ck(B).
For the converse, one can use the isomorphism h : Ck(A) → Ck(B) together
with relations Si to establish a winning strategy for the duplicator in the kpebble
game. Details are left as an easy exercise for the reader.
We next show how to translate formulae of LFP and PFP over Ck(A) to
formulae over A, and vice versa. We assume, as throughout most of Chap. 10,
that ﬁxed point formulae do not have parameters.
Lemma 11.25. 1. For every LFP or PFP formula ϕ(x) over vocabulary σ
that uses at most k variables, there is an LFP (respectively, PFP) formula
ϕ◦
over vocabulary ck(σ) in one free variable such that
A |= ϕ(a) ⇔ Ck(A) |= ϕ◦
([a]). (11.9)
2. For every LFP or PFP formula ϕ(x1, . . . , xm) in the language of ck(σ),
there is an LFP (respectively, PFP) formula ϕ∗
(y) over vocabulary σ in
km free variables such that
Ck(A) |= ϕ([a1], . . . , [am]) ⇔ A |= ϕ∗
(a1, . . . , am).
Before proving Lemma 11.25, we present its main application.
Theorem 11.26 (Abiteboul-Vianu). Ptime = Pspace iﬀ LFP = PFP.
Proof. Suppose Ptime = Pspace. Let ϕ be a PFP formula, and let it use k
variables. By Lemma 11.25 (1), we have a PFP formula ϕ◦
over ck(σ). Since
ϕ◦
is in PFP, it is computable in Pspace, and thus, by the assumption, in
Ptime. Since ϕ◦
is deﬁned over ordered structures of the vocabulary ck(σ), by
the Immerman-Vardi theorem it is deﬁnable in LFP over ck(σ), by a formula
ψ(x). Now applying Lemma 11.25 (2), we get an LFP formula ψ∗
(x) over
vocabulary σ which is equivalent to ϕ. Hence, LFP = PFP.
For the other direction, if LFP = PFP, then LFP+ < = PFP+ <, and
hence Ptime = Pspace.
11.5 Canonical Structures and the Abiteboul-Vianu Theorem 231
Corollary 11.27. The following are equivalent:
• LFP = PFP;
• LFP+< = PFP+<;
• Ptime = Pspace.
Notice that this picture diﬀers drastically from what we have seen for logics
capturing DLog, NLog, and Ptime: while the exact relationships between
DetTrCl+ < = DLog, TrCl+ < = NLog, and LFP+ < = Ptime are
not known, we do know that
DetTrCl TrCl LFP.
However, for the case of LFP and PFP, we cannot even conclude LFP PFP
without resolving the Ptime vs. Pspace question.
We now prove Lemma 11.25. As the ﬁrst step, we prove part 1 for the
case of ϕ being an FOk
formula. Note that in general, x may have fewer
than k variables. However, in this proof we shall treat any such formula as
deﬁning a k-ary relation; that is, ϕ(xj1 , . . . , xjs ) deﬁnes the relation ϕ(A) =
{(a1, . . . , ak) | A |= ϕ(aj1 , . . . , ajs )}, and when we write A |= ϕ(a), we
actually mean that a ∈ Ak
and a ∈ ϕ(A).
Using this convention, we deﬁne ϕ◦
by induction on the structure of the
formula:
• If ϕ is xi = xj, then choose π so that π(1) = i, π(2) = j, and let ϕ◦
(x) ≡
∃y Pπ(x, y) ∧ U(y) .
• If ϕ is an atomic formula of the form Ri(xj1 , . . . , xjs ), choose π so that
π(1) = j1, . . . , π(s) = js, and let ϕ◦
(x) ≡ ∃y Pπ(x, y) ∧ Ui(y) .
• (¬ϕ)◦
≡ ¬ϕ◦
.
• (ϕ1 ∨ ϕ2)◦
≡ ϕ◦
1 ∨ ϕ◦
2.
• If ϕ is ∃xiψ(x), then ϕ◦
(x) ≡ ∃y Si(x, y) ∧ ψ◦
(y) .
It is routine to verify, by induction on formulae, that the above translation
guarantees (11.9). For example, if ϕ is xi = xj, then A |= ϕ(a) implies
that ai = aj, and hence Ck(A) |= Pπ([a], [b]) for π(i) = 1, π(j) = 2, and
b = (ai, aj, . . .). Since Ck(A) |= U([b]), we conclude that Ck(A) |= ϕ◦
([a]).
Conversely, if Ck(A) |= Pπ([a], [b]) ∧ U([b]) for π as above and some b, we conclude
that there is c ∈ [a] with ci = cj. Since tpFOk (a) = tpFOk (c), it follows
that ai = aj and A |= ϕ(a). The other basis case is similar.
For the induction step, the only nontrivial case is that of ϕ being ∃xiψ(x).
If A |= ϕ(a), then for some ai that diﬀers from a in at most the ith position we
have A |= ψ(ai), and hence by the induction hypothesis, Ck(A) |= Si([a], [ai])∧
ψ◦
([ai]) and, therefore, Ck(A) |= ϕ◦
([a]). Conversely, assume that for some b,
232 11 Finite Variable Logics
Ck(A) |= Si([a], [b])∧ψ◦
([b]). Then we can ﬁnd a0 ≈FOk a and b0 ≈FOk b such
that a0 and b0 diﬀer in at most the ith position. Consider the k-pebble game
on (A, a0) and (A, a). Suppose that in one move the spoiler goes from (A, a0)
to (A, b0). Since the duplicator can play from position (a0, a), he can respond
to this move and ﬁnd b′
such that (A, b0) ≡∞ω
k (A, b′
). Hence, b′
∈ [b], and it
diﬀers from a in at most the ith position. Since [b′
] = [b], by the induction
hypothesis we conclude that A |= ψ(b′
), which witnesses A |= ϕ(a). This
concludes the proof of (11.9) for FOk
formulae.
Furthermore, (11.9) is preserved if we expand the vocabulary by an extra
relation symbol R, with a corresponding R′
added to ck(σ), and interpret R as
a relation closed under ≡∞ω
k . Since we know that all the stages of lfp and pfp
operators deﬁne such relations (see Exercise 11.6), we conclude that (11.9)
holds for LFP and PFP formulae.
The proof of part 2 of Lemma 11.25 is by straightforward induction on
the formulae, using the fact that ≺FOk is deﬁnable in LFP (Theorem 11.20).
Details are left to the reader as an exercise.
11.6 Bibliographic Notes
Inﬁnitary logics have been studied extensively in model theory, see, e.g., Barwise
and Feferman [18]. The ﬁnite variable logic was introduced by Barwise
[17], who also deﬁned the notion of a family of partial isomorphisms with the
k-back-and-forth property. Pebble games were introduced by Immerman [128]
and Poizat [200]. Kolaitis and Vardi [152, 153] studied many aspects of ﬁnite
variable logics; in particular, they showed that it subsumes ﬁxed point logics,
and proved normal forms for Lk
∞ω.
A systematic study of ﬁnite variable logics was undertaken by Dawar,
Lindell, and Weinstein [53], and our presentation here is based on that paper.
In particular, deﬁnability of FOk
-types in FOk
is from [53], as well as the
deﬁnition of a linear ordering on FOk
-types.
Theorem 11.26 is due to Abiteboul and Vianu [6], but the presentation
here is based on the model-theoretic approach of [53] rather than the more
computational approach of [6]. The approach of [6] is based on relational complexity.
Relational complexity classes are deﬁned using machines that compute
directly on structures rather than on their encodings as strings. Abiteboul
and Vianu [6] and Abiteboul, Vardi, and Vianu [4] establish a tight connection
between ﬁxed point logics and relational complexity classes, and show
that questions about containments among standard complexity classes can
be translated to questions about containments among relational complexity
classes.
Otto’s book [191] is a good source for information on ﬁnite variable logics
over ﬁnite models.
11.7 Exercises 233
Sources for exercises:
Exercises 11.6 and 11.7: Dawar, Lindell, and Weinstein [53]
Exercises 11.8 and 11.9: Dawar [49]
Exercise 11.10: de Rougemont [56]
Exercise 11.11: Dawar, Lindell, and Weinstein [53]
Exercise 11.12: Lindell [171]
Exercise 11.13: Grohe [108]
Exercises 11.14 and 11.15: Dawar, Lindell, and Weinstein [53]
Exercises 11.16 and 11.17: Kolaitis and Vardi [154]
Exercise 11.18: Grohe [110]
Exercise 11.19: McColm [181]
Kolaitis and Vardi [153]
11.7 Exercises
Exercise 11.1. Extend the proof of Theorem 11.12 to handle free variables, and
constants in the vocabulary.
Exercise 11.2. Fill in the details at the end of the proof of Theorem 11.20.
Exercise 11.3. Complete the proof of Proposition 11.24.
Exercise 11.4. Complete the proof of Lemma 11.25, part 2.
Exercise 11.5. Prove that the FOk
hierarchy is strict: there are properties expressible
in FOk+1
which are not expressible in FOk
.
Exercise 11.6. The goal of this exercise is to ﬁnd a tight (as far as the number of
variables is concerned) embedding of ﬁxed point logics into Lω
∞ω. Let LFPk
, IFPk
,
and PFPk
stand for restrictions of LFP, IFP, and PFP to formulae that use at
most k distinct variables (we assume that ﬁxed point formulae have no parameters).
Prove that LFPk
, IFPk
, PFPk
⊆ Lk
∞ω.
Hint: Let ϕ(R, x) be a formula, and let ϕi
(x) deﬁne the ith stage of a ﬁxed point
computation. Show by induction on i that the query deﬁned by ϕi
is closed under
≡∞ω
k , and use Corollary 11.14.
Exercise 11.7. Prove that if A and B agree on all FOk
sentences of quantiﬁer rank
up to nk
+ k + 1 and |A|≤ n, then A ≡∞ω
k B.
Exercise 11.8. Consider the complete bipartite graph Kn,m. Show that Kk,k ≡∞ω
k
Kk,k+1 for every k. Also show that Kn,m is Hamiltonian iﬀ n = m. Conclude that
Hamiltonicity is not Lω
∞ω-deﬁnable.
Exercise 11.9. Prove that 3-colorability is not Lω
∞ω-deﬁnable.
Exercise 11.10. Let In be a graph with n isolated vertices and Cm an undirected
cycle of length m. For two graphs G1 = V1, E1 and G2 = V2, E2 with V1 and V2
disjoint, let G1 × G2 be the graph whose nodes are V1 ∪ V2, and the edges include
E1, E2, as well as all the edges (v1, v2) for v1 ∈ V1, v2 ∈ V2. Prove that for a graph
of the form In × Cm, it is impossible to test, in Lω
∞ω, if n = m. Use this result to
give another proof (cf. Exercise 11.8) that Hamiltonicity is not Lω
∞ω-deﬁnable.
234 11 Finite Variable Logics
Exercise 11.11. A binary tree is balanced if all the leaves are at the same distance
from the root. Prove that L4
∞ω deﬁnes a Boolean query Q on graphs such that if
Q(G) is true, then G is a balanced binary tree.
Exercise 11.12. Prove that there is a Ptime query on balanced binary trees which
is not LFP-deﬁnable.
Conclude that LFP Lω
∞ω ∩ Ptime.
Exercise 11.13. Prove that the following problems are Ptime-complete for each
ﬁxed k.
• Given two σ-structures A and B, is it the case that A ≡∞ω
k B?
• Given a σ-structure A and a, b ∈ Ak
, are tpFOk (A, a) and tpFOk (A, b) the same?
Exercise 11.14. Prove that if A is a ﬁnite rigid structure (i.e., a structure that has
no nontrivial automorphisms), then there is a number k such that A is k-rigid.
Exercise 11.15. Prove that the structure Ck(A) can be constructed in polynomial
time.
Exercise 11.16. Deﬁne ∃Lk
∞ω as the fragment of Lk
∞ω that contains all atomic
formulae and is closed under inﬁnitary conjunctions and disjunctions, and existential
quantiﬁcation. Let
∃Lω
∞ω =
[
k
∃Lk
∞ω.
Prove that Datalog ⊆ ∃Lω
∞ω.
Exercise 11.17. Consider the following modiﬁcation of the k-pebble game. For two
structures A and B, the spoiler always plays in A and the duplicator always responds
in B. The spoiler wins if at some point, the position (a, b) does not deﬁne a partial
homomorphism (as opposed to a partial isomorphism in the standard game). The
duplicator wins (which is denoted by A ⊳∞ω
k B) if the spoiler does not win; that is,
if after each round the position deﬁnes a partial homomorphism.
Prove that the following are equivalent:
• A ⊳∞ω
k B.
• If Φ ∈ ∃Lk
∞ω and A |= Φ, then B |= Φ.
Exercise 11.18. By an FOk
theory we mean a maximally consistent set of FOk
sentences. Deﬁne the k-size of an FOk
theory T as the number of diﬀerent FOk
types
realized by ﬁnite models of T. Prove that there is no recursive bound on the
size of the smallest model of an FOk
theory in terms of its k-size. That is, for every
k there is a vocabulary σk such that is no recursive function f with the property
that every FOk
theory T in vocabulary σk has a model of size at most f(n), where
n is the k-size of T.
Exercise 11.19. Let C be a class of σ-structures. We call it bounded if for every
relation symbol R ∈ σ, there exists a number n such that every FO formula ϕ(R, x)
positive in R reaches its least ﬁxed point on any structure in C in at most n iterations.
Prove that the following are equivalent:
• C is bounded;
• Lω
∞ω collapses to FO on C.
Exercise 11.20.∗
Is the FOk
hierarchy strict over ordered structures? That is, are
there properties which, over ordered structures, are deﬁnable in FOk+1
but not in
FOk
, for arbitrary k?
12
Zero-One Laws
In this chapter we show that properties expressible in many logics are almost
surely true or almost surely false; that is, either they hold for almost all
structures, or they fail for almost all structures. This phenomenon is known
as the zero-one law. We prove it for FO, ﬁxed point logics, and Lω
∞ω. We
shall also see that the “almost everywhere” behavior of logics is drastically
diﬀerent from their “everywhere” behavior. For example, while satisﬁability
in the ﬁnite is undecidable, it is decidable if a sentence is true in almost all
ﬁnite models.
12.1 Asymptotic Probabilities and Zero-One Laws
To talk about asymptotic probabilities of properties of ﬁnite models, we
adopt the convention that the universe of a structure A with |A| = n will
be {0, . . ., n − 1}. Let us start by considering the case of undirected graphs.
By Grn we denote the set of all graphs with the universe {0, . . ., n − 1}. The
number of undirected graphs on {0, . . . , n − 1} is
|Grn | = 2(n
2).
Let P be a property of graphs. We deﬁne
µn(P) =
|{G ∈ Grn | G has P}|
|Grn |
.
That is, µn(P) is the probability that a randomly chosen graph on the set of
nodes {0, . . ., n−1} has P. Randomly here means with respect to the uniform
distribution: each graph is equally likely to be chosen.
We then deﬁne the asymptotic probability of P as
µ(P) = lim
n→∞
µn(P), (12.1)
236 12 Zero-One Laws
if the limit exists. If P is expressed by a sentence Φ of some logic, then we
refer to µn(Φ) and µ(Φ).
In general, we can deal with arbitrary σ-structures. In that case, we
can deﬁne sn
σ as the number of diﬀerent σ-structures with the universe
{0, . . . , n − 1}, and sn
σ(P) as the number of diﬀerent σ-structures with the
universe {0, . . ., n − 1} that have the property P, and let
µn(P) =
sn
σ(P)
sn
σ
.
Then the asymptotic probability µ(P) is deﬁned again by (12.1).
We now consider a few examples:
• Let P be the property “there are no isolated nodes”. We claim that µ(P) =
1. For that, we show that µ( ¯P) = 0, where ¯P is: “there is an isolated node”.
To calculate µn( ¯P), note that there are n ways to choose an isolated node,
and 2(n−1
2 ) ways to put edges on the remaining nodes. Hence
µn( ¯P) ≤
n · 2(n−1
2 )
2(n
2)
=
n
2n−1
,
and thus µ( ¯P) = 0.
• Let P be the property of being connected. Again, we show that µ( ¯P) = 0,
and thus the asymptotic probability of graph connectivity is 1.
To calculate µ( ¯P), we have to count the number of graphs with at least
two connected components. Assuming the size of one component is k,
– there are n
k ways to choose a subset X ⊆ {0, . . ., n − 1};
– there are 2(k
2) ways to put edges on X; and
– there are 2(n−k
2 ) ways to put edges on the complement of X.
Hence,
µn( ¯P) ≤
n−1
k=1
n
k · 2(k
2) · 2(n−k
2 )
2(n
2)
=
n−1
k=1
n
k
2k2+kn
=
n
2n+1
+
n−1
k=2
n
k
2k2+kn
≤
n
2n+1
+
1
22n
·
n−1
k=2
n
k
≤
n
2n+1
+
1
2n
→ 0.
• Consider the query even. Then
µn(even) =
1 if n is even,
0 if n is odd.
Hence, µ(even) does not exist.
12.1 Asymptotic Probabilities and Zero-One Laws 237
• The last example is the parity query. If σ has a unary relation U, then A
satisﬁes parityU iﬀ |UA
| mod 2 = 0. Therefore,
µn(parityU ) =
k≤n, k even
n
k
,
and hence µ(parityU ) = 1
2 .
Thus, for some properties P, the asymptotic probability µ(P) is 0 or 1,
for some, like parity, µ(P) could be a number between 0 and 1, and for some,
like even, it may not even exist.
Deﬁnition 12.1 (Zero-one law). Let L be a logic. We say that it has the
zero-one law if for every property P (i.e., a Boolean query) deﬁnable in L,
either µ(P) = 0, or µ(P) = 1.
The ﬁrst property P for which we proved µ(P) = 1 was the absence of
isolated nodes: this property is FO-deﬁnable. Graph connectivity, which also
has asymptotic probability 1, is not FO-deﬁnable, but it is deﬁnable in LFP
and hence in Lω
∞ω. On the other hand, the even and parityU queries, which
violate the zero-one law, are not Lω
∞ω-deﬁnable, as we saw in Chap. 11. It
turns out that µ(P) is 0 or 1 for every property deﬁnable in Lω
∞ω.
Theorem 12.2. Lω
∞ω has the zero-one law.
Corollary 12.3. FO, LFP, IFP, and PFP all have the zero-one law.
Zero-one laws can be seen as statements that a logic cannot do nontrivial
counting. For example, if a logic L has the zero-one law, then even is not
expressible in it, as well as any divisibility properties (e.g., is the size of a
certain set congruent to q modulo p?), cardinality comparisons (e.g., is | X |
bigger than |Y |?), etc.
Note also that while LFP, IFP, PFP, and Lω
∞ω all have the zero-one law,
their extensions with ordering no longer have it, since LFP+ < deﬁnes even,
a Ptime query.
In the presence of a linear order (in fact, even successor), FO fails to have
the zero-one law too. To see this, let S be the successor relation, and consider
the sentence
∀x∀y ∀z ¬S(z, x) ∧ ¬S(y, z) → E(x, y) ,
saying that if x is the initial and y the ﬁnal element of the successor relation,
then there is an edge between them. Since this sentence states the existence
of one speciﬁc edge, its asymptotic probability is 1
2 .
We shall prove Theorem 12.2 in the next section after we introduce the
main tool for the proof: extension axioms.
238 12 Zero-One Laws
z
T S − T
Fig. 12.1. Extension axiom
12.2 Extension Axioms
Extension axioms are statements deﬁned as follows. Let S be a ﬁnite set of
cardinality n, and let T ⊆ S be of cardinality m. Then the extension axiom
EAn,m says that there exists z ∈ S such that for all x ∈ T , there is an edge
between z and x, and for all x ∈ S − T , there is no edge between z and x.
This is illustrated in Fig. 12.1.
Extension axioms can be expressed in FO in the language of graphs. In
fact, EAn,m is given by the following sentence:
∀x1, . . . , xn
i=j
xi = xj → ∃z






n
i=1
z = xi
∧
i≤m
E(z, xi)
∧
i>m
¬E(z, xj)






. (12.2)
The extension axiom EAn,m is vacuously true in a structure with fewer
than n elements, but we shall normally consider it in structures with at least
n elements.
We shall be using special cases of extension axioms, when |S| = 2k and
|T | is k. Such an extension axiom will be denoted by EAk. That is, EAk says
if X ∩ Y = ∅, and | X |=| Y |= k, then there is z such that there is an edge
(x, z) for all x ∈ X but there is no edge (y, z) for any y ∈ Y .
Proposition 12.4. µ(EAk) = 1 for each k.
Proof. We show instead that µ(¬EAk) = 0. Let n > 2k. Note that for EAk
to fail, there must be disjoint X and Y of cardinality k such that there is no
z ∈ X ∪ Y with E(x, z) for all x ∈ X and ¬E(y, z) for all y ∈ Y . We now
calculate µn(¬EAk), for n > 2k.
• There are n
k ways to choose X.
12.2 Extension Axioms 239
• There are n−k
k ways to choose Y . Therefore, there are at most n
k ·
n−k
k ≤ n2k
ways to choose X and Y .
• Since there are no restrictions on edges on X ∪ Y , there are 2(2k
2 ) ways to
put edges on X ∪ Y .
• Again, since there are no restrictions on edges outside of X ∪ Y , there are
2(n−2k
2 ) ways to put edges outside of X ∪ Y .
• The only restriction we have is on putting edges between X ∪ Y and its
complement X ∪ Y : for each of the n − 2k elements z ∈ X ∪ Y , we can
put edges between z and the 2k elements of X ∪ Y in every possible way
except one, where z is connected to every member of X and not connected
to any member of Y . Hence, for each z there are 22k
− 1 ways of putting
edges between z and X ∪ Y , and therefore the number of ways to put
edges between X ∪ Y and X ∪ Y is (22k
− 1)n−2k
.
Thus,
µn(¬EAk) ≤
n2k
· 2(2k
2 ) · 2(n−2k
2 ) · (22k
− 1)n−2k
2(n
2)
. (12.3)
A simple calculation shows that
2(2k
2 ) · 2(n−2k
2 )
2(n
2)
≤
1
22k(n−2k)
. (12.4)
Combining (12.3) and (12.4) we obtain
µn(¬EAk) ≤ n2k
· 1 −
1
22k
n−2k
→ 0,
proving that µ(¬EAk) = 0 and µ(EAk) = 1.
Corollary 12.5. µ(EAn,m) = 1, for any n and m ≤ n.
Proof. For graphs of size > 2n, EAn implies EAn,m for any m ≤ n.
Corollary 12.6. Each EAk has arbitrarily large ﬁnite models.
Notice that it is not immediately obvious from the statement of EAk that
there are ﬁnite graphs with at least 2k elements satisfying it. However, Proposition
12.4 tells us that we can ﬁnd such graphs; in fact, almost all graphs
satisfy EAk.
We now move to the proof of the zero-one law for Lω
∞ω. First, we need a
lemma.
240 12 Zero-One Laws
Lemma 12.7. Let G1, G2 be ﬁnite graphs such that G1, G2 |= EAn,m for all
m ≤ n ≤ k. Then G1 ≡∞ω
k G2.
Proof. The extension axioms provide the strategy. Suppose we have a position
in the game where (a1, . . . , ak) have been played in G1 and (b1, . . . , bk) in
G2. Let the spoiler move the ith pebble from ai to some element a. Let
I ⊆ {1, . . . , k} − {i} be all the indices such that there is an edge from a
to aj, for all j ∈ I. Then by the extension axioms we can ﬁnd b ∈ G2 such
that there is an edge from b to every bj, for j ∈ I, and there are no edges
from b to any bl, for l ∈ I. Hence, the duplicator can play b as the response
to a. This shows that the pebble game can continue indeﬁnitely, and thus
G1 ≡∞ω
k G2.
And ﬁnally, we prove the zero-one law. Let Φ be from Lk
∞ω. Suppose there
is a model G of EAk, of size at least 2k, that is also a model of Φ. Suppose G′
is a graph that satisﬁes EAk and has at least 2k elements. Then, by Lemma
12.7, we have G′
≡∞ω
k G and hence G′
|= Φ. Therefore, µ(ϕ) ≥ µ(EAk) = 1.
Conversely, assume that no model of EAk of size ≥ 2k is a model of Φ. Then
µ(Φ) ≤ µ(¬EAk) = 0.
We now revisit the example of graph connectivity, for which the asymptotic
probability was shown to be 1. If we look at EA2, then for graphs with at least
four nodes it implies that, for any x = y, there exists z such that E(x, z) and
E(y, z) hold. Hence, every graph with at least four nodes satisfying EA2 is
connected, and thus µ(connectivity) = 1.
As another example of using extension axioms for computing asymptotic
probabilities, consider EA2 and an edge (x, y). As before, we can ﬁnd a node
z such that E(x, z) and E(y, z) hold, and hence a graph satisfying EA2 has a
cycle (x, y, z). This means that µ(acyclicity) = 0.
Finally, we explain how to state the extension axioms for an arbitrary vocabulary
σ that contains only relation symbols. Given variables x1, . . . , xn,
let Aσ(x1, . . . , xn) be the collection of all atomic σ-formulae of the form
R(xi1 , . . . , xim ), where R ranges over relations from σ, and m is the arity
of R. Let F ⊆ Aσ(x1, . . . , xn). With F, we associate a formula χF (x1, . . . , xn)
(called a complete description) given by
ϕ∈F
ϕ ∧
ψ∈Aσ(x1,...,xn)−F
¬ψ.
That is, a complete description states precisely which atomic formulae in
x1, . . . , xn are true, and which are not.
Let F now be a subset of Aσ(x1, . . . , xn), and G a subset of
Aσ(x1, . . . , xn, xn+1) such that G extends F; that is, F ⊆ G. Then the extension
axiom EAF,G is the sentence
12.3 The Random Graph 241
∀x1 . . . xn




i=j
xi = xj ∧ χF (x1, . . . , xn) →
∃xn+1
i≤n
xn+1 = xi ∧ χG(x1, . . . , xn)




saying that every complete description in n variables can be extended to every
consistent complete description in n + 1 variables. A similar argument shows
that µ(EAF,G) = 1. Therefore, the zero-one law holds for arbitrary ﬁnite
structures, not only graphs.
12.3 The Random Graph
In this section we deal with a certain inﬁnite structure. This structure, called
the random graph, has an interesting FO theory: it consists of precisely all the
sentences Φ for which µ(Φ) = 1. By analyzing the random graph, we prove
that it is decidable, for an FO sentence Φ, whether µ(Φ) = 1.
First, recall the BIT predicate: BIT(i, j) is true iﬀ the jth bit of the binary
expansion of i is 1.
Deﬁnition 12.8. The random graph is deﬁned as the inﬁnite (undirected)
graph RG = N, E where there is an edge between i and j, for j < i, iﬀ
BIT(i, j) is true.
Why is this graph called random? After all, the construction is completely
deterministic. It turns out there is a probabilistic construction that results
in this graph. Suppose someone wants to randomly build a countable graph
whose nodes are natural numbers. When reaching a new node n, this person
would look at all nodes k < n, and for each of them will toss a coin to
decide if there is an edge between k and n. What kind of graph does one get
as the result? It turns out that with probability 1, the constructed graph is
isomorphic to RG.
However, for our purposes, we do not need the probabilistic construction.
What is important to us is that the random graph satisﬁes all the extension
axioms. Indeed, to see that RG |= EAn,m, let S ⊂ N be of size n and X ⊆ S be
of size m. Let l be a number which, when given in binary, has ones in positions
from X, and zeros in positions from S − X. Furthermore, assume that l has
a one in some position whose number is higher than the maximal number
in S. Then l witnesses EAn,m for S and T . To give a concrete example, if
S = {0, 1, 2, 3, 4} and X = {0, 2, 3}, then the number l is 45, or 101101 in
binary.
Next, we deﬁne a theory
EA = {EAk | k ∈ N}. (12.5)
Recall that a theory T (a set of sentences over vocabulary σ) is complete if
for each sentence Φ, either T |= Φ or T |= ¬Φ; it is ω-categorical if, up to
242 12 Zero-One Laws
isomorphism, it has only one countable model, and decidable, if it is decidable
whether T |= Φ.
Theorem 12.9. EA is complete, ω-categorical, and decidable.
Proof. For ω-categoricity, we claim that up to isomorphism, RG is the only
countable model of EA. Suppose that G is another model of EA (and
thus it satisﬁes all the extension axioms EAn,m). We claim that RG ≡ω G;
that is, the duplicator can play countably many moves of the EhrenfeuchtFra¨ıss´e
game on RG and G. Indeed, suppose after round r we have a position
((a1, . . . , ar), (b1, . . . , br)) deﬁning a partial isomorphism, and suppose the
spoiler plays ar+1 in RG. Let I = {i ≤ r | RG |= E(ar+1, ai)}. Since G |= EA,
by the appropriate extension axiom we can ﬁnd br+1 such that G |= E(br+1, bi)
iﬀ i ∈ I. Thus, the resulting position ((a1, . . . , ar, ar+1), (b1, . . . , br, br+1)) still
deﬁnes a partial isomorphism.
If we have two countable structures such that A ≡ω B, then A ∼= B.
Indeed, if A = {ai | i ∈ N} and B = {bi | i ∈ N}, let the spoiler play, in
each even round, the smallest unused element of A, and in each odd round
the smallest unused element of B. Then the union of the sequence of partial
isomorphisms generated by this play is an isomorphism between A and B.
Thus, we have shown that G |= EA implies G ∼= RG and hence EA is
ω-categorical.
The next step is to show completeness of EA. Suppose that we have a
sentence Φ such that neither EA |= Φ nor EA |= ¬Φ. Thus, both theories
EA∪{Φ} and EA∪{¬Φ} are consistent. By the L¨owenheim-Skolem theorem,
we get two countable models G′
, G′′
of EA such that G′
|= Φ and G′′
|= ¬Φ.
However, by ω-categoricity, this means that G′ ∼= G′′ ∼= RG. This contradiction
proves that EA is complete.
Finally, a classical result in model theory says that a recursively axiomatizable
complete theory is decidable. Since (12.5) provides a recursive axiomatization,
we conclude that EA is decidable.
Corollary 12.10. If Φ is an FO sentence, then RG |= Φ iﬀ µ(Φ) = 1.
Proof. Let RG |= Φ. Since EA is complete, EA |= Φ, and hence, by compactness,
for some k > 0, {EAi | i ≤ k} |= Φ. Thus, EAk |= Φ and hence
µ(Φ) ≥ µ(EAk) = 1. Conversely, if RG |= ¬Φ, then µ(¬Φ) = 1 and µ(Φ) = 0.
Hence, for any Φ with µ(Φ) = 1, we have RG |= ϕ.
Combining Corollary 12.10 and decidability of EA, we obtain the follow-
ing.
Corollary 12.11. For an FO sentence Φ it is decidable whether µ(Φ) = 1.
Thus, Trakhtenbrot’s theorem tells us that it is undecidable whether a
sentence is true in all ﬁnite models, but now we see that it is decidable whether
a sentence is true in almost all ﬁnite models.
12.4 Zero-One Law and Second-Order Logic 243
12.4 Zero-One Law and Second-Order Logic
We have proved the zero-one law for the ﬁnite variable logic Lω
∞ω and its
fragments such as FO and ﬁxed point logics. It is natural to ask what other
logics have it. Since the zero-one law can be seen as a statement saying that
a logic cannot count, counting logics cannot have it. Another possibility is
second-order logic and its fragments. Even such a simple fragment as ∃SO,
the existential second-order logic, does not have the zero-one law: since ∃SO
equals NP, the query even is in ∃SO. But we shall see that some nontrivial
restrictions of ∃SO have the zero-one law.
One way to obtain such restrictions is to look at quantiﬁer preﬁxes of the
ﬁrst-order part. Recall that an ∃SO sentence can be written as
∃X1 . . . ∃XnQ1x1 . . . Qmxm ϕ(X1, . . . , Xn, x1, . . . , xm), (12.6)
where each Qi is ∀ or ∃, and ϕ is quantiﬁer-free. If r is a regular expression
over the alphabet {∃, ∀}, by ∃SO(r) we denote the set of all sentences (12.6)
such that the string Q1 . . . Qm is in the language denoted by r. For example,
∃SO(∃∗
∀∗
) is a fragment of ∃SO that consists of sentences (12.6) for which
the ﬁrst-order part has all existential quantiﬁers in front of the universal
quantiﬁers.
Theorem 12.12. ∃SO(∃∗
∀∗
) has the zero-one law.
Proof. To keep the notation simple, we shall prove this for undirected graphs,
but the result is true for arbitrary vocabularies that contain only relation
symbols. The result will follow from two lemmas.
Lemma 12.13. Let S1, . . . , Sm be relation symbols, and ϕ an FO sentence of
vocabulary {S1, . . . , Sm, E} such that
RG |= ∀S1 . . . ∀Sm ϕ(S1, . . . , Sm).
Then there is an FO sentence Φ of vocabulary {E} such that µ(Φ) = 1 and
Φ → ∀S ϕ is a valid sentence.
Lemma 12.14. Let S1, . . . , Sm be relation symbols, and ϕ(x, y) a quantiﬁerfree
FO formula of vocabulary {S1, . . . , Sm, E} such that
RG |= ∃S1 . . . ∃Sm ∃x ∀y ϕ(S, x, y).
Then there is an FO sentence Ψ of vocabulary {E} such that µ(Φ) = 1 and
Φ → ∃S ∃x ∀y ϕ is a ﬁnitely valid sentence.
First, these lemmas imply the theorem. Indeed, assume that we are given
an ∃SO(∃∗
∀∗
) sentence Θ ≡ ∃S ∃x ∀y ϕ. Let RG |= Θ. Then, by Lemma
12.14, there is a sentence Φ with µ(Φ) = 1 such that Θ is true in every ﬁnite
244 12 Zero-One Laws
model of Φ, and hence µ(Θ) = 1. Conversely, assume RG |= ¬Θ. Since ¬Θ is
an ∀SO sentence, by Lemma 12.13 we ﬁnd a sentence Φ with µ(Φ) = 1 such
that ¬Θ is true in every model of Φ, and thus µ(¬Θ) = 1 and µ(Θ) = 0.
Hence, µ(Θ) is either 0 or 1. It remains to prove the lemmas.
Proof of Lemma 12.13. Assume that RG |= ∀Sϕ(S), but for every FO sentence
Φ with µ(Φ) = 1, it is the case that (Φ → ∀S ϕ) is not a valid sentence (i.e.,
Φ ∧ ∃S¬ϕ(S) has a model).
Consider the theory T = EA∪{¬ϕ} of vocabulary {S1, . . . , Sm, E}. Since
every ﬁnite conjunction of extension axioms has asymptotic probability 1, by
compactness we conclude that T is consistent, and by the L¨owenheim-Skolem
theorem, it has a countable model A. Since EA is ω-categorical, the {E}reduct
of A is isomorphic to RG. But then RG |= ∃S¬ϕ(S), a contradiction.
This proves Lemma 12.13.
Proof of Lemma 12.14. Let |S | = m and |x| = n. Let A1, . . . , Am witness the
second-order quantiﬁers, and let a1, . . . , an be the elements of RG witnessing
FO existential quantiﬁers. Let RG0 be the ﬁnite subgraph of RG with the
universe {a1, . . . , an}. We can ﬁnd ﬁnitely many extension axioms {EAk,l}
such that their conjunction implies the existence of a subgraph isomorphic
to RG0. Let Φ be the conjunction of all such extension axioms. Let A be
a ﬁnite model of Φ. By the extension axioms, there is a subgraph RGA of
RG that is isomorphic to A and contains RG0. Now we claim that RGA |=
∃S∃x∀y ϕ. To witness the second-order quantiﬁers, we take the restrictions of
the Ai’s to RGA; as witnesses of FO existential quantiﬁers we take a1, . . . , an.
Since universal sentences are preserved under substructures, we conclude that
RGA |= ∀y ϕ(A, a, y), and thus RGA |= ∃S∃x∀y ϕ. Therefore, A |= ∃S∃x∀y ϕ,
which proves the lemma.
There are more results concerning zero-one laws for fragments of SO, but
they are signiﬁcantly more complicated, and we present them without proofs.
One other preﬁx class which admits the zero-one law is ∃∗
∀∃∗
; that is, exactly
one universal quantiﬁer is present.
Theorem 12.15. ∃SO(∃∗
∀∃∗
) has the zero-one law.
Going to two universal quantiﬁers, however, creates problems.
Theorem 12.16. ∃SO(∀∀∃) does not have the zero-one law, even if the FO
part does not use equality.
For some preﬁx classes, the failure of the zero-one law is fairly easy to
show. Consider, for example, the sentence
∃S ∀x∃y∀z


S(x, y) ∧ ¬S(x, x)
∧ S(x, z) → y = z
∧ S(x, z) ↔ S(z, x)

 .
12.5 Almost Everywhere Equivalence of Logics 245
This in an ∃SO(∀∃∀) sentence saying there is a permutation S in which every
element has order 2; that is, this sentence expresses even and thus ∃SO(∀∃∀)
fails the zero-one law. A similar sentence can be written in ∃SO(∀∀∀∃). The result
can further be strengthened to show that both ∃SO(∀∃∀) and ∃SO(∀∀∀∃)
fail to have the zero-one law even if the FO order part does not mention
equality.
12.5 Almost Everywhere Equivalence of Logics
In this short section, we shall prove a somewhat surprising result that on
almost all structures, there is no diﬀerence between FO, LFP, PFP, and Lω
∞ω.
Deﬁnition 12.17. Given a logic L, its fragment L′
, and a vocabulary σ, we
say that L and L′
are almost everywhere equivalent over σ, if there is a class
C of ﬁnite σ-structures such that µ(C) = 1 and for every L formula ϕ, there
is an L′
formula ψ such that ϕ and ψ coincide on structures from C.
Theorem 12.18. Lω
∞ω and FO are almost everywhere equivalent over σ, for
any purely relational vocabulary σ.
Proof sketch. For simplicity, we deal with undirected graphs. Let Ck be the
class of ﬁnite graphs satisfying EAk. We claim that on Ck, every Lk
∞ω formula
is equivalent to an FOk
formula. Indeed, for a tuple a = (a1, . . . , ak) in a
structure A ∈ Ck, its FOk
type tpFOk (A, a) is completely determined by the
atomic type of a; that is, by the atomic formulae E(ai, aj) that hold for a.
To see this, notice that if a and b have the same atomic type, then (a, b) is a
partial isomorphism, and by EAk from the position (a, b) the duplicator can
play indeﬁnitely in the k-pebble game; hence, (A, a) ≡∞ω
k (A, b).
Therefore, there are only ﬁnitely many FOk
types, and each Lk
∞ω formula
is a disjunction of those, and thus equivalent to an FOk
formula. (In fact, we
proved a stronger statement that on Ck, every Lk
∞ω formula is equivalent to
a quantiﬁer-free FOk
formula.)
We now consider the classes C1 ⊆ C2 ⊆ . . ., and observe that since each
µ(Ck) is 1, then for any sequence ǫ1 > ǫ2 > . . . > 0 such that limn→∞ ǫn = 0,
we can ﬁnd an increasing sequence of numbers n1 < n2 < . . . < nk < . . . such
that
µn(Ck ∩ Grn) > 1 − ǫk, for n > nk.
We then deﬁne
C = A ∈ STRUCT[{E}] if |A| ≥ nk, then A ∈ Ck .
One can easily check that µ(C) = 1. We claim that every Lω
∞ω formula is
equivalent to an FO formula on C. Indeed, let ϕ be an Lk
∞ω formula. We know
that on Ck, it is equivalent to an FOk
formula ϕ′
. Thus, to ﬁnd a formula ψ
246 12 Zero-One Laws
to which ϕ is equivalent on C, one explicitly enumerates all the structures of
cardinality up to nk and evaluates ϕ on them. Then, one writes an FO formula
ψk saying that if A is one of the structures with |A| < nk, then ψk(A) = ϕ(A),
and for all the structures with |A| ≥ nk, ψk agrees with ϕ′
. Since the number
of structures of cardinality up to nk is ﬁxed, this can be done in FO.
This result has complexity-theoretic implications. While we know that
LFP and PFP queries have respectively Ptime and Pspace data complexity,
Theorem 12.18 shows that their complexity can be reduced to AC0
on almost
all structures.
12.6 Bibliographic Notes
That FO has the zero-one law was proved ﬁrst by Glebskii et al. [92] in 1969,
and independently by Fagin (announced in 1972, but the journal version [73]
appeared in 1976). Fagin used extension axioms introduced by Gaifman [87].
Blass, Gurevich, and Kozen [22] and – independently – Talanov and Knyazev
[227] proved that LFP has the zero-one law, and the result for Lω
∞ω is due to
Kolaitis and Vardi [152].
The random graph was discovered by Erd¨os and R´enyi [67] (the probabilistic
construction); the deterministic construction used here is due to Rado
[203]. In fact, RG is sometimes referred to as the Rado graph. This is also a
standard construction in model theory (the Fra¨ıss´e limit of ﬁnite graphs, see
[125]). The results about the theory of the random graph are from Gaifman
[87]. Fagin [74] oﬀers some additional insights into the history of extension axioms.
The fact that the inﬁnite Ehrenfeucht-Fra¨ıss´e game implies isomorphism
of countable structures is from Karp [143].
The study of the zero-one law for fragments of ∃SO was initiated by Kolaitis
and Vardi [150], where they proved Theorem 12.12. Theorem 12.15 is
from Kolaitis and Vardi [151], and Theorem 12.16 is from Le Bars [163]. A
good survey on zero-one laws and SO is Kolaitis and Vardi [155] (in particular,
it explains how to prove that the zero-one law fails for ∃SO(∀∃∀) and
∃SO(∀∀∀∃) without equality).
Theorem 12.18 is from Hella, Kolaitis, and Luosto [122]. For related results
in the context of database query evaluation, see Abiteboul, Compton, and
Vianu [1].
Sources for exercises:
Exercises 12.3 and 12.4: Fagin [73]
Exercise 12.5: Lynch [173]
Exercise 12.6: Kaufmann and Shelah [144] and Le Bars [164]
Exercise 12.7: Grandjean [104]
Exercise 12.8: Hodges [125]
Exercise 12.9 (b): Cameron [31]
12.7 Exercises 247
Exercise 12.11: Le Bars [163]
Exercise 12.12: Kolaitis and Vardi [150]
Kolaitis and Vardi [155]
Blass, Gurevich, and Kozen [22]
12.7 Exercises
Exercise 12.1. Calculate µ(P) for the following properties P:
• rigidity;
• 2-colorability;
• being a tree;
• Hamiltonicity;
• having diameter 2.
Exercise 12.2. Prove the zero-one law for arbitrary vocabularies, using extension
axioms EAF,G.
Exercise 12.3. Instead of µn(P), consider νn(P) as the ratio of the number of
diﬀerent isomorphism types of graphs on {0, . . . , n−1} that have P and the number
of all diﬀerent isomorphism types of graphs on {0, . . . , n−1}. Let ν(P) be deﬁned as
the limit of νn(P). Prove that if P is an FO-deﬁnable property, then ν(P) = µ(P),
and thus is either 0 or 1.
Exercise 12.4. If constant or function symbols are allowed in the vocabulary, the
zero-one law may not be true. Speciﬁcally, prove that:
• if c is a constant symbol and U a unary predicate symbol, then U(c) has asymptotic
probability 1
2
;
• if f is a unary function symbol, then ∀x ¬(x = f(x)) has asymptotic probability
1
e
.
Exercise 12.5. Instead of the usual successor relation, consider a circular successor:
a relation of the form {(a1, a2), (a2, a3), . . . , (an−1, an), (an, a1)}. Prove that in the
presence of a circular successor, FO continues to have the zero-one law.
Exercise 12.6. Prove that MSO does not have the zero-one law.
Hint: choose a vocabulary σ to consist of several binary relations, and prove that
there is an FO formula ϕ(x, y) of vocabulary σ ∪ {U}, where U is unary, such that
the MSO sentence ∃U ϕ′
almost surely holds, where ϕ′
states that the set of pairs
for (x, y) for which ϕ(x, y) holds is a linear ordering.
Then the failure of the zero-one law follows since we know that MSO+ < can
deﬁne even.
Prove a stronger version of this failure, for the vocabulary of one binary relation.
Exercise 12.7. Prove that for vocabularies with bounded arities, the problem of
deciding whether µ(Φ) = 1, where Φ is FO, is Pspace-complete.
Exercise 12.8. Prove that the random graph admits quantiﬁer elimination: that is,
every formula ϕ(x) is equivalent to a quantiﬁer-free formula ϕ′
(x).
248 12 Zero-One Laws
Exercise 12.9. (a) Consider the following undirected graph G: its universe is N+ =
{n ∈ N | n > 0} and there is an edge between n and m, for n > m, iﬀ n is
divisible by pm, the mth prime. Prove that G is isomorphic to the random graph
RG.
Hint: the proof does not require any number theory, and is a simple application
of extension axioms.
(b) Consider another countable graph G′
whose universe is the set of primes congruent
to 1 modulo 4. Put an edge between p and q if p is a quadratic residue
modulo q. Prove that G′
is isomorphic to the random graph RG.
Exercise 12.10. Let Φ be an arbitrary ∃SO sentence. Prove that it is undecidable
whether µ(Φ) = 1.
Exercise 12.11. Prove that the restriction of ∃SO, where the ﬁrst-order part is a
formula of FO2
, does not have the zero-one law.
Exercise 12.12. Prove that for vocabularies with bounded arities, the problem of
deciding whether µ(Φ) = 1 is
• Nexptime-complete, if Φ is an ∃SO(∃∗
∀∗
) sentence, or an ∃SO(∃∗
∀∃∗
) sentence;
• Exptime-complete, if Φ is an LFP sentence.
Exercise 12.13.∗
Does ∃SO(∀∀∃) have the zero-one law over graphs?
13
Embedded Finite Models
In ﬁnite model theory, we deal with logics over ﬁnite structures. In embedded
ﬁnite model theory, we deal with logics over ﬁnite structures embedded into
inﬁnite ones. For example, one assumes that nodes of graphs are numbers,
and writes sentences like
∃x∃y E(x, y) ∧ (x · y = x · x + 1)
saying that there is an edge (x, y) in a graph with xy = x2
+ 1. The inﬁnite
structure in this case could be R, +, · , or N, +, · , or Q, +, · .
What kinds of queries can one write in this setting? We shall see in this
chapter that the answer depends heavily on the properties of the inﬁnite
structure into which the ﬁnite structures are embedded: for example, queries
such as even and graph connectivity turn out to be expressible on structures
embedded into N, +, · , or Q, +, · , but not R, +, · .
The main motivation for embedded ﬁnite models comes from database theory.
Relational calculus – that is, FO – is the basic relational query language.
However, databases store interpreted elements such as numbers or strings,
and queries in all practical languages use domain-speciﬁc operations, like
arithmetic operations for numbers, or concatenation and preﬁx comparison
for strings, etc. Embedded ﬁnite model theory studies precisely these kinds
of languages over ﬁnite models, where the underlying domain is potentially
inﬁnite, and operations over that domain can be used in formulae.
13.1 Embedded Finite Models: the Setting
Assume that we have two vocabularies, Ω and σ, where σ is ﬁnite and relational.
Let M be an inﬁnite Ω-structure U, Ω , where U is an inﬁnite set.
For example, if Ω contains two binary functions + and ·, then R, +, · and
N, +, · are two possible inﬁnite Ω-structures, with + and · interpreted, in
both cases, as addition and multiplication respectively.
250 13 Embedded Finite Models
Deﬁnition 13.1. Let M = U, Ω be an inﬁnite Ω-structure, and let σ =
{R1, . . . , Rm}. Suppose the arity of each Ri is pi > 0. Then an embedded
ﬁnite model (i.e., a σ-structure embedded into M) is a structure
A = A, RA
1 , . . . , RA
l ,
where each RA
i is a ﬁnite subset of Upi
, and A is the set of all the elements
of U that occur in the relations RA
1 , . . . , RA
l . The set A is called the active
domain of A, and is denoted by adom(A).
So far this is not that much diﬀerent from the usual ﬁnite model, except
that the universe comes from a given inﬁnite set U. What makes the setting
diﬀerent, however, is the presence of the underlying structure M, which makes
it possible to use rich logics for deﬁning queries on embedded ﬁnite models.
That is, instead of just FO over A, we shall use FO over
(M, A) = (U, Ω, RA
1 , . . . , RA
l ,
making use of operations available on M.
Before we deﬁne this logic, denoted by FO(M, σ), we shall address the
issue of quantiﬁcation. The universe of (M, A) is U, so saying ∃xϕ(x) means
that there is an element of U that witnesses ϕ. But while we are dealing with
ﬁnite structures A embedded into M, quantiﬁcation over the entire set U is
not always very convenient.
Consider, for example, the simple property of reﬂexivity. In the usual ﬁnite
model theory context, to state that a binary relation E is reﬂexive, we would
say ∀x E(x, x). However, if the interpretation of ∀x is “for all x ∈ U”, this
sentence would be false in all embedded ﬁnite models! What we really want
to say here is: “for all x in the active domain, E(x, x) holds”.
The deﬁnition of FO(M, σ) thus provides additional syntax to quantify
over elements of the active domain.
Deﬁnition 13.2. Given M = U, Ω and a relational vocabulary σ, ﬁrst-order
logic (FO) over M and σ, denoted by FO(M, σ), is deﬁned as follows:
• Any atomic FO formula in the language of M is an atomic FO(M, σ) formula.
For any p-ary symbol R from σ and terms t1, . . . , tp in the language
of M, R(t1, . . . , tp) is an atomic FO(M, σ) formula.
• Formulae of FO(M, σ) are closed under the Boolean connectives ∨, ∧, and
¬.
• If ϕ is an FO(M, σ) formula, then the following are FO(M, σ) formulae:
– ∃x ϕ,
– ∀x ϕ,
– ∃x∈adom ϕ, and
– ∀x∈adom ϕ.
13.1 Embedded Finite Models: the Setting 251
The class of ﬁrst-order formulae in the language of M will be denoted by
FO(M) (i.e., the formulae built up from atomic M-formulae by Boolean connectives
and quantiﬁcation ∃, ∀). The class of formulae not using the symbols
from Ω will be denoted by FO(σ) (in this case all four quantiﬁers are allowed).
The notions of free and bound variables are the usual ones. To deﬁne the
semantics, we need to deﬁne the relation (M, A) |= ϕ(a), for a formula ϕ(x)
and a tuple a over U of values of free variables. All the cases are standard,
except quantiﬁcation. If we have a formula ϕ(x, y), and a tuple of elements b
(values for y), then
(M, A) |= ∃x ϕ(x, b) iﬀ (M, A) |= ϕ(a, b) for some a ∈ U.
On the other hand,
(M, A) |= ∃x∈adom ϕ(x, b) iﬀ (M, A) |= ϕ(a, b) for some a ∈ adom(A).
The deﬁnitions for the universal quantiﬁcation are:
(M, A) |= ∀x ϕ(x, b) iﬀ (M, A) |= ϕ(a, b) for all a ∈ U
(M, A) |= ∀x∈adom ϕ(x, b) iﬀ (M, A) |= ϕ(a, b) for all a ∈ adom(A).
Since M is most of the time clear from the context, we shall often write
A |= ϕ(a) instead of the more formal (M, A) |= ϕ(a).
The quantiﬁers ∃x ∈ adom ϕ and ∀x ∈ adom ϕ are called active-domain
quantiﬁers. We shall sometimes refer to the usual quantiﬁes ∃ and ∀ as unrestricted
quantiﬁers.
From the point of view of expressive power, active-domain quantiﬁers are a
mere convenience: since adom(A) is deﬁnable with unrestricted quantiﬁcation,
so are these quantiﬁers. But we use them separately in order to deﬁne an
important sublogic of FO(M, σ).
Deﬁnition 13.3. By FOact(M, σ) we denote the fragment of FO(M, σ) that
only uses quantiﬁers ∃x ∈ adom and ∀x ∈ adom. Formulae in this fragment
are called the active-domain formulae.
Before moving on to the expressive power of FO(M, σ), we brieﬂy discuss
evaluation of such formulae. Since quantiﬁcation is no longer restricted to a
ﬁnite set, it is not clear a priori that formulae of FO(M, σ) can be evaluated –
and, indeed, in some cases there is no algorithm for evaluating them. However,
there is one special case when evaluation of formulae is “easy” (that is, easy
to explain, not necessarily easy to evaluate).
Suppose we have a sentence Φ of FO(M, σ), and an embedded ﬁnite model
A. We further assume that every element c ∈ adom(A) is deﬁnable over M:
that is, there is an FO(M) formula αc(x) such that M |= αc(x) iﬀ x = c.
In such a case, we replace every occurrence of an atomic formula
R(t1(x), . . . , tm(x)), where R ∈ σ and the ti’s are terms, by
252 13 Embedded Finite Models
(c1,...,cm)∈RA
αc1 (t1(x)) ∧ . . . ∧ αcm (tm(x)).
That is, we say that the tuple of values of the ti(x)’s is one of the tuples in
RA
. Thus, if ΦA
is the sentence obtained from Φ by such a replacement, then
(M, A) |= Φ ⇔ M |= ΦA
. (13.1)
Notice that ΦA
is an FO(M) sentence, since all the σ-relations disappeared.
Now using (13.1) we can propose the following evaluation algorithm: given Φ,
construct ΦA
, and check if M |= ΦA
. The last is possible if the theory of M is
decidable.
13.2 Analyzing Embedded Finite Models
When we brieﬂy looked at the standard model-theoretic techniques in Chap. 3,
we noticed that they are generally inapplicable in the setting of ﬁnite model
theory. For embedded ﬁnite models, we mix the ﬁnite and the inﬁnite: we study
logics over pairs (M, A), where M is inﬁnite and A is ﬁnite. So the question
arises: can we use techniques of either ﬁnite or inﬁnite model theory?
It turns out that we cannot use ﬁnite or inﬁnite model-theoretic techniques
directly; as we are about to show, in general, they fail over embedded ﬁnite
models. Then we outline a new kind of tools that is used with embedded
ﬁnite models: by using inﬁnite model-theoretic techniques, we reduce questions
about embedded ﬁnite models to questions about ﬁnite models, for which the
preceding 12 chapters give us plenty of answers. In general, we shall see that
the behavior of FO(M, σ) depends heavily on model-theoretic properties of
the underlying structure M.
We now discuss standard (ﬁnite) model-theoretic tools and their applicability
to the study of embedded ﬁnite models.
First, notice that compactness fails over embedded ﬁnite models for the
same reason as for ﬁnite models. One can write sentences λn, n ≥ 0, stating
that adom(A) contains at least n elements. Then T = {λn | n ≥ 0} is ﬁnitely
consistent: every ﬁnite set of sentences has a ﬁnite model. However, T itself
does not have a ﬁnite model.
One tool that deﬁnitely applies in the embedded setting is EhrenfeuchtFra¨ıss´e
games. However, playing a game is very hard. Assume, for example,
that M is the real ﬁeld R, +, · . Suppose σ is empty, and we want to show
that the query even, testing if |adom(A)| is even, is not expressible (which,
as we shall see later, is a true statement). As in the proof given in Chap. 3,
suppose even is expressible by a sentence Φ of quantiﬁer rank k. Before, we
picked two structures, A1 of cardinality k and A2 of cardinality k + 1, and
showed that A1 ≡k A2. Our problem now is that showing A1 ≡k A2 no longer
suﬃces, as we have to prove
13.2 Analyzing Embedded Finite Models 253
(M, A1) ≡k (M, A2) (13.2)
instead. For example, in the old strategy for winning the game on A1 and
A2, if the spoiler plays any point a1 in A1 in the ﬁrst move, the duplicator
can respond by any point A2. But now we have to account for additional
atomic formulae such as p(x) = 0, where p is a polynomial. So if we know
that p(a1) = 0 for some given p, the strategy must also ensure that p(a2) = 0.
It is not at all clear how one can play a game like that, to satisfy (13.2).
The next obvious approach is to try ﬁnite model-theoretic techniques that
avoid Ehrenfeucht-Fra¨ıss´e games, such as locality and zero-one laws. This approach,
however, cannot be used for all structures M, as the following example
shows.
Let N be the well-known structure N, +, · ; that is, natural numbers with
the usual arithmetic operations. A σ-structure over N is a σ-structure whose
active domain is a ﬁnite subset of N, and hence it can be encoded by some
reasonable encoding (e.g., a slight modiﬁcation of the encoding of Chap. 6,
where in addition all numbers in the active domain are encoded in binary).
A Boolean query on σ-structures embedded into N is a function Q from such
structures into {true, false}. It is computable if there is a computable function
fQ : {0, 1}∗
→ {0, 1} such that fQ(s) = 1 iﬀ s is an encoding of a structure A
such that Q(A) = true.
Proposition 13.4. Every computable Boolean query on σ-structures embedded
into N can be expressed in FO(N, σ).
Proof. Without loss of generality, we assume that σ contains a single binary
relation E. We use the following well-known fact about N: every computable
predicate P ⊆ Nm
is deﬁnable by an FO(N) formula, which we shall denote
by ψP (x1, . . . , xm). The idea of the proof then is to code ﬁnite σ-structures
with numbers. For a query Q, the sentence deﬁning it will be
ΦQ = ∃x χ(x) ∧ ψPQ (x) , (13.3)
where χ(x) says that the input structure A is coded by the number x, and
the predicate PQ is the computable predicate such that PQ(n) holds iﬀ n is
the code of a structure A with Q(A) = true.
Thus, we have to show how to code structures. Let pn denote the nth
prime, with the numeration starting at p0 = 2. Suppose we have a structure
A with adom(A) = {n1, . . . , nk}. We ﬁrst code the active domain by
code0(A) =
k
i=1
pni .
There is a formula χ0(x) of FO(N, σ) such that A |= χ0(n) iﬀ code0(A) = n.
Such a formula states the following condition:
254 13 Embedded Finite Models
• for each l ∈ adom(A), n is divisible by pl but not divisible by p2
l , and
• if n is divisible by a prime number p, then p is of the form pl for some
l ∈ adom(A).
Since the binary relation {(n, pn) | n ≥ 0} is computable and thus deﬁnable
in FO(N), χ0 can be expressed as an FO(N, σ) formula.
We next code the edge relation E. Let pair : N × N → N be the standard
pairing function. We then code EA
by
code1(A) =
(ni,nj )∈EA
ppair(ni,nj ).
As in the case of coding the active domain, there exists a formula χ1(x) such
that A |= χ1(n) iﬀ code1(A) = n – the proof is the same as for χ0. Finally, we
code the whole structure by
code(A) = pair code0(A), code1(A) .
Clearly, A = B implies code(A) = code(B), so we did deﬁne a coding function.
Moreover, since χ0 and χ1 are FO(N, σ) formulae, the formula χ(x) can be
deﬁned as ∃y∃z χ0(y)∧χ1(z)∧ψP (y, z, x), where P is the graph of the pairing
function. This completes the coding scheme, and thus shows that (13.3) deﬁnes
Q on structures embedded into N.
Therefore, in FO(N, σ) we can express queries that violate locality notions
(e.g., connectivity) and queries that do not obey the zero-one law (e.g., parity).
Hence, we need a totally diﬀerent set of techniques for proving bounds on
the expressive power of FO(M, σ). If we want to prove results about FO(M, σ),
perhaps we can relate this logic to something we know how to deal with:
the pure ﬁnite model theory setting. In our new terminology, this would be
FOact(U∅, σ), where U∅ = U, ∅ is a structure of the empty vocabulary. That
is, there are no functions or predicates from M used in formulae, and all
quantiﬁcation is restricted to the ﬁnite universe adom(A). (Notice that the
setting of FOact(U∅, σ) is in fact a bit more restrictive than the usual ﬁnite
model theory setting: in the latter, we quantify over a ﬁnite universe that may
be larger than the active domain.)
For technical reasons that will become clear a bit later, we shall deal not
with U∅ but rather with U< = U, < , where < is a linear order on U. Then
FOact(U<, σ) corresponds to what we called FO+< in the ﬁnite model theory
setting. We know a number of results about this logic: in particular, it cannot
express the query even (Theorem 3.6) nor can it express graph connectivity
(Theorem 5.8).
We now present the ﬁrst of our two new tools. First, we need the following.
Suppose Ω′
expands Ω by adding some (perhaps inﬁnitely many) predicate
symbols. We call a structure M′
= U, Ω′
a deﬁnitional expansion of M =
U, Ω if for every predicate P ∈ Ω′
− Ω, there exists a formula ψP (x) in the
language of M such that PM′
= {a | M |= ψP (a)}.
13.2 Analyzing Embedded Finite Models 255
Deﬁnition 13.5. We say that M admits the restricted quantiﬁer collapse, or
RQC, if there exists a deﬁnitional expansion M′
of M such that
FO(M, σ) = FOact(M′
, σ)
for every σ.
The notion of RQC can be formulated without using a deﬁnitional expansion
as follows. For every FO(M, σ) formula ϕ(x), there is an equivalent
formula ϕ′
(x) such that no σ-relation appears within the scope of an unrestricted
quantiﬁer ∃ or ∀ (i.e., σ-relations only appear within the scope of
restricted quantiﬁers ∃x∈adom and ∀x∈adom).
There is one special form of the restricted quantiﬁer collapse, which arises
for structures M that have the collapse and also have quantiﬁer elimination
(that is, every FO(M) formula is equivalent to a quantiﬁer-free one). In this
case, if FOact(M′
, σ) refers to a deﬁnable predicate P ∈ Ω′
−Ω, we know that
P is deﬁnable by a quantiﬁer-free formula over M. Hence, using the deﬁnition
of P, we obtain an equivalent FO(M, σ) formula. Thus, we have:
Proposition 13.6. If M admits the restricted quantiﬁer collapse (RQC) and
has quantiﬁer elimination, then
FO(M, σ) = FOact(M, σ). (13.4)
The condition in (13.4) is usually called the natural-active collapse, since
the standard unrestricted interpretation of quantiﬁers is sometimes called the
“natural interpretation”.
Using RQC, or the natural-active collapse, eliminates quantiﬁcation outside
of the active domain. To reduce the expressiveness of FO(M, σ) to that
of FOact(U<, σ), we would also like to eliminate all references to M functions
and predicates, except possibly order. This, however, in general is impossible:
how could one express a query like ∃x∈adom∃y∈adom E(x, y)∧x·y = x+1?
To deal with this problem, we use the notion of genericity which comes
from the classical relational database setting. Informally, it states the following:
when one evaluates formulae on embedded ﬁnite models, exact values
of elements in the active domain do not matter. For example, the answer
to the query “Does a graph have diameter 2?” is the same for the graph
{(1, 2), (1, 3), (1, 4)} and for the graph {(5, 10), (5, 15), (5, 20)}, which is obtained
by the mapping 1 → 5, 2 → 10, 3 → 15, 4 → 20.
In general, generic queries commute with permutations of the universe.
Queries expressible in FO(M, σ) need not be generic: for example, the query
given by ∃x∈adom∃y∈adom E(x, y)∧x·y = x+1 is true on E = {(1, 2)} but
false on E = {(1, 3)}. However, as all queries deﬁnable in standard logics over
ﬁnite structures are generic, to reduce questions about FO(M, σ) to those in
ordinary ﬁnite model theory, it suﬃces to restrict one’s attention to generic
queries.
256 13 Embedded Finite Models
We now deﬁne genericity for queries (which map a ﬁnite σ-structure A to
a ﬁnite subset of Am
, m ≥ 0). Given a function π : U → U, we extend it to
ﬁnite σ-structures A by replacing each occurrence of a ∈ adom(A) with π(a).
Deﬁnition 13.7. • A query Q is generic if for every partial injective function
π : U → U which is deﬁned on adom(A), it is the case that
Q(A) = Q(π(A)).
• The class of generic queries deﬁnable in FO(M, σ) or FOact(M, σ) is denoted
by FOgen
(M, σ) or FOgen
act (M, σ), respectively.
While it is undecidable in general if an FO(M, σ) query is generic, most
queries whose inexpressibility we want to prove are generic.
Deﬁnition 13.8. We say that M admits the active-generic collapse, if
FOgen
act (M, σ) ⊆ FOact(U<, σ).
Now using the diﬀerent notions of collapse together, we come up with the
following methodology of proving bounds on FO(M, σ).
Proposition 13.9. Let M admit both the restricted-quantiﬁer collapse (RQC)
and the active-generic collapse. Then every generic query expressible in
FO(M, σ) is also expressible in FOact(U<, σ).
For example, it would follow from Theorem 3.6 that for M as in the proposition
above, even is not expressible in FO(M, σ). Furthermore, for such M,
every query in FOgen
(M, σ) is Gaifman-local, by Proposition 13.9 and Theorem
5.8.
Thus, our next goal is to see for what structures collapse results can be
established. We start with the active-generic collapse, and prove, in the next
section, that it holds for all structures.
The situation with RQC is not nearly as simple. We shall see that it fails for
N, +, · and Q, +, · , but we shall prove it for the ordered real ﬁeld R, +, ·, <
, 0, 1 . This structure motivated much of the initial work on embedded ﬁnite
models due to its database applications; this will be explained in Sect. 13.6.
More examples of RQC (or its failure) are given in the exercises. We shall also
revisit the random graph of the previous chapter and relate queries over it to
those deﬁnable in MSO.
13.3 Active-Generic Collapse
Our goal is to prove the following result.
Theorem 13.10. Every inﬁnite structure M admits the active-generic col-
lapse.
13.3 Active-Generic Collapse 257
We shall assume that M is ordered: that is, one of its predicates is <
interpreted as a linear order on its universe U. If this were not the case, we
could have expanded M to M< by adding a linear order. Since FO(M, σ) ⊆
FO(M<, σ), the active-generic collapse for M< would imply the collapse for
M:
FOgen
act (M, σ) ⊆ FOgen
act (M<, σ) ⊆ FOact(U<, σ).
The idea behind the proof of Theorem 13.10 is as follows: we show that
for each formula, its behavior on some inﬁnite set is described by a ﬁrst-order
formula which only uses < and no other symbol from the vocabulary of M.
This is called the Ramsey property. We then show how genericity and the
Ramsey property imply the collapse.
Deﬁnition 13.11. Let M = U, Ω be an ordered structure. We say that an
FOact(M, σ) formula ϕ(x) has the Ramsey property if the following is true:
Let X be an inﬁnite subset of U. Then there exists an inﬁnite set
Y ⊆ X and an FOact(U<, σ) formula ψ(x) such that for every σstructure
A with adom(A) ⊂ Y , and for every a over Y , it is the case
that A |= ϕ(a) ↔ ψ(a).
We now prove the Ramsey property for an arbitrary ordered M. The
following simple lemma will often be used as a ﬁrst step in proofs of collapse
results. Before stating it, note that for an FO(M, σ) formula (x = y) can
be viewed as both an atomic FO(σ) formula and an atomic FO(M) formula.
We choose to view it as an atomic FO(M) formula; that is, atomic FO(σ)
formulae are only those of the form R(· · · ) for R ∈ σ.
Lemma 13.12. Let ϕ(x) be an FO(M, σ) formula. Then there exists an equivalent
formula ψ(x) such that every atomic subformula of ψ is either an FO(σ)
formula, or an FO(M) formula. Furthermore, it can be assumed that none of
the free variables x occurs in an FO(σ)-atomic subformula of ψ(x). If ϕ is an
FOact(M, σ) formula, then ψ is also an FOact(M, σ) formula.
Proof. Introduce m fresh variables z1, . . . , zm, where m is the maximal arity of
a relation in σ, and replace any atomic formula of the form R(t1(y), . . . , tl(y)),
where l ≤ m and the ti’s are M-terms, by ∃z1 ∈adom . . . ∃zl ∈adom i(zi =
ti(y))∧R(z1, . . . , zl). Similarly use existential quantiﬁers to eliminate the free
x-variables from FO(σ)-atomic formulae.
The key in the inductive proof of the Ramsey property is the case of
FO(M) subformulae. For this, we ﬁrst recall the inﬁnite version of Ramsey’s
theorem, in the form most convenient for our purposes.
Theorem 13.13 (Ramsey). Given an inﬁnite ordered set X, and any partition
of the set of all ordered m-tuples x1, . . . , xm , x1 < . . . < xm, of elements
of X into l classes A1, . . . , Al, there exists an inﬁnite subset Y ⊆ X such that
all ordered m-tuples of elements of Y belong to the same class Ai.
258 13 Embedded Finite Models
The following is a standard model-theoretic result that we prove here for
the sake of completeness.
Lemma 13.14. Let ϕ(x) be an FO(M) formula. Then ϕ has the Ramsey
property.
Proof. Consider a (ﬁnite) enumeration of all the ways in which the variables x
may appear in the order of U. For example, if x = (x1, . . . , x4), one possibility
is x1 = x3, x2 = x4, and x1 < x2. Let P be such an arrangement, and
ζ(P) a ﬁrst-order formula that deﬁnes it (x1 = x3 ∧ x2 = x4 ∧ x1 < x2 in
the above example). Note that there are ﬁnitely many such arrangements P;
let P be the set of all of those. Each P induces an equivalence relation on
x: for example, {(x1, x3), (x2, x4)} for P above. Let xP
be a subtuple of x
containing a representative for each class (e.g., (x1, x4)) and let ϕP
(xP
) be
obtained from ϕ by replacing all variables from an equivalence class by the
chosen representative. Then ϕ(x) is equivalent to
P ∈P
ζ(P) ∧ ϕP
(xP
).
We now show the following. Let P′
⊆ P and P0 ∈ P′
. Let X ⊆ U be an
inﬁnite set. Assume that ψ(x) is given by
P ∈P′
ζ(P) ∧ ϕP
(xP
).
Then there exists an inﬁnite set Y ⊆ X and a quantiﬁer-free formula γP0 (x)
of the vocabulary {<} such that ψ is equivalent to
γP0 (x) ∨
P ∈P′−{P0}
ζ(P) ∧ ϕP
(xP
)
for tuples x of elements of Y .
To see this, suppose that P0 has m equivalence classes. Consider a partition
of tuples of Xm
ordered according to P0 into two classes: A1 of those tuples
for which ϕP0
(xP0
) is true, and A2 of those for which ϕP0
(xP0
) is false. By
Ramsey’s theorem, for some inﬁnite set Y ⊆ X either all ordered tuples
over Y m
are in A1, or all are in A2. In the ﬁrst case, ψ is equivalent to
ζ(P0) ∨ P ∈P′−{P0} ζ(P) ∧ ϕP
(xP
), and in the second case ψ is equivalent to
P ∈P′−{P0} ζ(P) ∧ ϕP
(xP
), proving the claim.
The lemma now follows by applying this claim inductively to every partition
P ∈ P, passing to smaller inﬁnite sets, while getting rid of all the formulae
containing symbols other than = and <. At the end we have an inﬁnite set
over which ϕ is equivalent to a quantiﬁer-free formula in the vocabulary {<}.
The next lemma lifts the Ramsey property from FO(M) formulae to arbitrary
FOact(M, σ) formulae.
13.3 Active-Generic Collapse 259
Lemma 13.15. Every FOact(M, σ) formula has the Ramsey property.
Proof. By Lemma 13.12, we assume that every atomic subformula is an
FOact(σ) formula or an FO(M) formula. The base cases for the induction
are the case of FOact(σ) formulae, where there is no need to change the formula
or ﬁnd a subset, and the case of FO(M) atomic formulae, which is given
by Lemma 13.14.
Let ϕ(x) ≡ ϕ1(x)∧ϕ2(x), where X ⊆ U is inﬁnite. First, ﬁnd ψ1, Y1 ⊆ X,
such that for every A and a over Y1, it is the case that A |= ϕ1(a) ↔ ψ1(a).
Next, by using the hypothesis for ϕ2 and Y1, ﬁnd an inﬁnite set Y2 ⊆ Y1 such
that for every A and a over Y2, it is the case that A |= ϕ2(a) ↔ ψ2(a). Then
take ψ ≡ ψ1 ∧ ψ2 and Y = Y2.
The case of ϕ = ¬ϕ′
is trivial.
For the existential case, let ϕ(x) ≡ ∃y∈adom ϕ1(y, x). By the hypothesis,
ﬁnd Y ⊆ X and ψ1(y, x) such that for every A and a over Y and every b ∈ Y
we have A |= ϕ1(b, a) ↔ ψ1(b, a). Let ψ(x) ≡ ∃y ∈ adom ψ1(y, x). Then, for
every A and a over Y , A |= ψ(a) iﬀ A |= ψ1(b, a) for some b ∈ adom(A) iﬀ
A |= ϕ1(b, a) for some b ∈ adom(A) iﬀ A |= ϕ1(a), thus ﬁnishing the proof.
To ﬁnish the proof of Theorem 13.10, we have to show the following.
Lemma 13.16. Assume that every FOact(M, σ) formula has the Ramsey
property. Then M admits the active-generic collapse.
Proof. Let Q be a generic query deﬁnable in FOact(M, σ). By the Ramsey
property, we ﬁnd an inﬁnite X ⊆ U and an FOact(U<, σ)-deﬁnable Q′
that
coincides with Q on X. We claim they coincide everywhere. Let A be a σstructure.
Since X is inﬁnite, there exists a partial monotone injective function
π from adom(A) into X such that for every pair of elements a < a′
of adom(A),
there exist x1, x2, x3 ∈ X − π(adom(A)) with the property that x1 < π(a) <
x2 < π(a′
) < x3.
By the genericity of Q, we have π(Q(A)) = Q(π(A)). Thus, Q(π(A)) coincides
with the restriction of Q′
(π(A)) to X. We now notice that Q′
does not
extend its active domain. Indeed, if adom(Q′
(π(A))) contained an element
b ∈ π(adom(A)), we could have replaced this element by b′
∈ X −π(adom(A))
such that for every a ∈ π(adom(A)), a < b iﬀ a < b′
. Since Q′
is FOact(U<, σ)deﬁnable,
this would imply that b′
∈ adom(Q′
(π(A))), which contradicts the
fact that over X, the queries Q and Q′
coincide.
Hence, π(Q(A)) = Q(π(A)) = Q′
(π(A)). Again, since Q′
is FOact(U<, σ)deﬁnable,
it commutes with any monotone injective map, and thus Q′
(π(A)) =
π(Q′
(A)). We have shown that π(Q(A)) = π(Q′
(A)), from which Q(A) =
Q′
(A) follows.
This completes the proof of Theorem 13.10.
Thus, no matter what functions and predicates there are in M, FO cannot
express more generic active-domain semantics queries over it than just
FOact(U<, σ). In particular, we have the following.
260 13 Embedded Finite Models
Corollary 13.17. Let M be an arbitrary structure. Then queries such as
even, parity, majority, connectivity, transitive closure, and acyclicity are
not deﬁnable in FOact(M, σ).
13.4 Restricted Quantiﬁer Collapse
One part of our program for establishing bounds on FO(M, σ) has been very
successful: we prove the active-generic collapse for arbitrary structures. Can
we hope to achieve the same success with the restricted-quantiﬁer collapse
(RQC)? The answer is clearly negative.
Corollary 13.18. The restricted-quantiﬁer collapse fails over N = N, +, · .
Proof. By Corollary 13.17, parity is not deﬁnable in FOact(N, σ), but by
Proposition 13.4, it is expressible in FO(N, σ).
Furthermore, RQC fails over Q, +, · , since it is possible to deﬁne natural
numbers within this structure, and then emulate the proof of Proposition 13.4
to show that every computable query is expressible.
However, the situation becomes very diﬀerent when we move to the real
numbers. We shall consider the real ordered ﬁeld: that is, the structure
R = R, +, ·, <, 0, 1 .
This is the structure that motivated much of the initial development in embedded
ﬁnite models, due to its close connections with questions about the
expressiveness of languages for geographical databases.
Consider the following FO(R, {E}) sentence, where E is a binary relation
symbol:
∃u∃v∀x∈adom∀y∈adom E(x, y) → y = u · x + v , (13.5)
saying that all elements of E ⊂ R2
lie on a line. Notice that it is essential
that the ﬁrst two quantiﬁers range over the entire set R. For example, if E is
interpreted as {(2, 2), (3, 3), (4, 4)}, then the sentence (13.5) is true, and the
witnesses for the existential quantiﬁers are u = 1 and v = 0. But neither 0
nor 1 is in the active domain of E.
Nevertheless, (13.5) can be expressed by an FOact(R, {E}) sentence. To
see this, notice that E lies on a line iﬀ every three points in E are collinear.
This can be expressed as
∀x1 ∈adom∀y1 ∈adom∀x2 ∈adom∀y2 ∈adom∀x3 ∈adom∀y3 ∈adom
E(x1, y1) ∧ E(x2, y2) ∧ E(x3, y3) → collinear(x, y) (13.6)
13.4 Restricted Quantiﬁer Collapse 261
where collinear(x, y) is a formula, over R, stating that (x1, y1), (x2, y2), and
(x3, y3) are collinear. It is easy to check that collinear(x, y) can be written as
a quantiﬁer-free formula (in fact, due to the quantiﬁer elimination for the real
ﬁeld, every formula over R is equivalent to a quantiﬁer-free formula, but the
condition for collinearity can easily be expressed directly). Hence, (13.6) is an
FOact(R, {E}) formula, equivalent to (13.5).
This example is an instance of a much more general result, stating that
the real ﬁeld R admits RQC. In fact, we show the natural-active collapse for
R (since R has quantiﬁer elimination). Moreover, the proof is constructive.
Theorem 13.19. The real ﬁeld R = R, +, ·, <, 0, 1 admits the restricted
quantiﬁer collapse. That is, for every FO(R, σ) formula ϕ(x), there is an
equivalent FOact(R, σ) formula ϕact(x). Moreover, there is an algorithm that
constructs ϕact from ϕ.
Proof. The proof of this result is by induction on the structure of the formula.
We shall always assume, by Lemma 13.12, that all atomic FO(σ) formulae are
of the form S(y), where y contains only variables. Thus, the base cases of the
induction are as follows:
• ϕ(x) is S(x). In this case ϕact ≡ ϕ.
• ϕ(x) is an atomic FO(R) formula. Again, ϕact ≡ ϕ in this case.
The cases of Boolean operations are simple:
• If ϕ ≡ ψ ∨ χ, then ϕact ≡ ψact ∨ χact;
• if ϕ ≡ ¬ψ, then ψact ≡ ¬ψact.
We now move to the case of an unrestricted existential quantiﬁer. We
shall ﬁrst treat the case of σ-structures A with adom(A) = ∅; at the end of
the proof, we shall explain how to deal with empty structures.
Suppose ϕ(x) ≡ ∃z β(x, z). By the induction hypothesis, β can be assumed
to be of the form
β(x, z) ≡ Qy1 ∈adom . . . Qym ∈adom BC αi(x, y, z) ,
where each Q is either ∃ or ∀, and:
1. BC αi(x, y, z) is a Boolean combination of atomic formulae α1, . . . , αs;
2. each FO(σ) atomic formula is of the form S(u), where u ⊆ y;
3. all atomic FO(R) formulae are of the form p(x, y, z) = 0 or p(x, y, z) > 0,
where p is a polynomial; and
262 13 Embedded Finite Models
4. n, m > 0, and at least one of the FO(R) atomic formulae involves a
multivariate polynomial p(x, y, z) = yi − z for some yi.
The reason for this is that, under the assumption adom(A) = ∅, we
can always replace β by
β ∧ ∃y∈adom (y − y = 0) ∧ (y − z = 0) ∨ ¬(y − z = 0) .
Putting the resulting formula in the prenex normal form fulﬁlls the conditions
listed in this item.
We now assume that αi(x, y, z), 1 ≤ i ≤ n, are FO(R) atomic formulae
pi(x, y, z)

=
>
ﬀ
0, and αi, n < i ≤ s, are FO(σ) atomic formulae. We let di
be the degree, in z, of pi. For each a, b, by pa,b
i (z) we denote the univariate
polynomial pi(a, b, z). Note that the degree of pa,b
i is at most di. We let d =
maxi di. Whenever we refer to the jth root of a univariate polynomial p,
we mean its jth real root in the usual ordering, if such a root exists, and 0
otherwise. Note that there exists an FO(R) formula rootj
p(x) which holds iﬀ
x is the jth root of p.
We now prove the following.
Lemma 13.20. Let ϕ(x) be as above, where the assumptions 1–4 hold. Let A
be such that adom(A) = ∅. Fix a tuple of real numbers a. Then (R, A) |= ϕ(a)
iﬀ there exist i, k ≤ n, and j, l ≤ d and two tuples b, c over adom(A) of length
|y|, such that
(R, A) |= β a,
ra,b
ij + ra,c
kl
2
∨ β a, ra,b
ij + 1 ∨ β a, ra,b
ij − 1 ,
where ra,b
ij is the jth root of pa,b
i and ra,c
kl is the kth root of pa,c
j .
Proof of Lemma 13.20. One direction is trivial: if there is a witness of a given
form, then there is a witness. For the other direction, assume that (R, A) |=
ϕ(a). We then must show that there exists a0 ∈ R of the form
ra,b
ij +ra,c
kl
2 or
ra,b
ij ± 1 such that (R, A) |= β(a, a0).
Let b1, . . . , bM be the enumeration of all the tuples of length |y| consisting
of elements of adom(A). Consider all univariate polynomials p
a,bj
i (z), and let
rijk be the kth root of p
a,bj
i (z), for k ≤ d. Let S be the family of all elements
of the form rijk, i ≤ n, j ≤ M, k ≤ d. It follows from our assumptions that
S = ∅ and adom(A) ⊆ S, since one of the polynomials is yi − z. We let rmin
and rmax be the minimum and the maximum elements of S, respectively.
Suppose (R, A) |= β(a, a0). If a0 ∈ S, then there is a polynomial pi, a
tuple b, and j ≤ d such that a0 = ra,b
ij . By selecting c = b, k = i, l = j, we see
that a0 is of the required form.
13.4 Restricted Quantiﬁer Collapse 263
Assume a0 ∈ S. There are three possible cases:
1. a0 < rmin, or
2. a0 > rmax, or
3. there exist r1, r2 ∈ S such that r1 < a0 < r2, and there is no other r ∈ S
with r1 < r < r2.
We claim that for every pi and every bj:
sign p
a,bj
i (a0) = sign p
a,bj
i (rmin − 1) in case 1
sign p
a,bj
i (a0)) = sign p
a,bj
i (rmax + 1) in case 2 (13.7)
sign p
a,bj
i (a0)) = sign p
a,bj
i
r1 + r2
2
in case 3.
Indeed, in the third case, suppose sign p
a,bj
i (a0) = sign p
a,bj
i (r1+r2
2 ) .
Then the interval [a0, r1+r2
2 ] contains a real root of p
a,bj
i (z), which then must
be in S. We conclude that there is an element of S between r1 and r2, a
contradiction. The other two cases are similar.
Let a1 be (rmin − 1) for case 1, (rmax + 1) for case 2, and r1+r2
2 for case 3.
Then for every tuple bj, j ≤ M, and every atomic formula αi, we have
αi(a, bj, a0) ↔ αi(a, bj, a1). (13.8)
This follows from (13.7) and the fact that FO(σ) atomic formulae may not
contain variable z.
We can now use (13.8) to conclude that β(a, a0) ↔ β(a, a1). Clearly,
the equivalence (13.8) propagates through Boolean combinations of formulae.
Furthermore, notice that if for a ﬁnite set A and m > 0, α(a, b, b, a0) ↔
α(a, b, b, a1) for every b ∈ A and every b ∈ Am
, then
(∃x ∈ A α(a, x, b, a0)) ↔ (∃x ∈ A α(a, x, b, a1))
for every b ∈ Am
. This shows that (13.8) propagates through active-domain
quantiﬁcation, and hence β(a, a0) ↔ β(a, a1).
Thus, if (R, A) |= β(a, a0), then (R, A) |= β(a, a1). Since a1 is of the right
form (either r − 1, or r + 1 for r ∈ S, or r+r′
2 for r, r′
∈ S), this concludes the
proof of the lemma.
To conclude the proof of the theorem, we note that Lemma 13.20 can be
translated into an FO deﬁnition as follows. For each FO(R) atomic formula
α(x, y, z), and for any two tuples u, v of the same length as y, we deﬁne the
following formulae:
264 13 Embedded Finite Models
• α
1/2
ikjl(x, y, u, v), for i, k ≤ n, j, l ≤ d, says that α(x, y, z) holds when z is
equal to
rx,u
ij +rx,v
kl
2 . That is,
∃z∃z1∃z2 rootj
[px,u
i ](z1) ∧ rootl
[px,v
k ](z2) ∧ (2z = z1 + z2) ∧ α(x, y, z) .
• α+
ij(x, y, u) for i ≤ n, j ≤ d, says that α(x, y, z) holds for z = rx,u
ij + 1;
that is,
∃z∃z1 rootj
[px,u
i ](z1) ∧ (z = z1 + 1) ∧ α(x, y, z) .
• α−
ij(x, y, u) for i ≤ n, j ≤ d, says that α(x, y, z) holds for z = rx,u
ij − 1;
the FO deﬁnition is similar to the one given above, except that we use a
conjunct z = z1 − 1.
Note that by quantiﬁer elimination for R, we may assume that all formulae
α
1/2
ikjl(x, y, u, v), α+
ij(x, y, u), and α−
ij(x, y, u) are quantiﬁer-free.
For i, k ≤ n, and j, l ≤ d, let γ
1/2
ikjl(x, y, u, v) be the Boolean combination
BC(αs) where each atomic FO(R) formula α is replaced by α
1/2
ikjl(x, y, u, v).
Let β
1/2
ijkl(x, u, v) be
Qy1 ∈adom . . . Qym ∈adom γ
1/2
ikjl(x, y, u, v).
Likewise, we deﬁne γ+
ij (x, y, u) to be the Boolean combination BC(αs) where
each atomic FO(R) formula α is replaced by α+
ij(x, y, u), and let β+
ij(x, u) be
γ+
ij (x, y, u) preceded by the quantiﬁer preﬁx of β. Finally, we deﬁne β−
ij (x, u)
as β+
ij(x, u), except by using formulae α−
ij(x, y, u).
Now Lemma 13.20 says that ∃z β(x, z) is equivalent to
∃u∈adom∃v∈adom
i,k≤n j,l≤d
β
1/2
ijkl(x, u, v) ∨ β+
ij (x, u) ∨ β−
ij (x, u) ,
which is an FOact(R, σ) formula.
This completes the proof of the translation for the case of structures A with
adom(A) = ∅. To deal with empty structures A, consider a formula ϕ(x), and
let ϕ′
(x) be an FO(R) formula obtained from ϕ(x) by replacing each atomic
FO(σ) subformula by false. Note that if adom(A) = ∅, then (R, A) |= ϕ(a) iﬀ
R |= ϕ′
(a). By quantiﬁer elimination, we may assume that ϕ′
is quantiﬁerfree.
Hence, ϕ is equivalent to
¬∃y∈adom(y = y) ∧ ϕ′
(x) ∨ ∃y∈adom(y = y) ∧ ϕact(x) , (13.9)
where ϕact is constructed by the algorithm for the case of nonempty structures.
Clearly, (13.9) will work for both empty and nonempty structures. Since (13.9)
is an FOact(R, σ) formula, this completes the proof.
13.5 The Random Graph and Collapse to MSO 265
Corollary 13.21. Every generic query in FO(R, σ) is expressible in
FOact( R, < , σ). In particular, every such query is local, and even is not
expressible in FO(R, σ).
What other structures have RQC? There are many known examples, some
of them presented as exercises at the end of the chapter. It follows immediately
from Theorem 13.19 that R, +, < has RQC. Another example is given by
R, +, ·, ex
, the expansion of the real ﬁeld with the function x → ex
. The ﬁeld
of complex numbers is known to have RQC, as well as several structures on
ﬁnite strings. See Exercises 13.10 – 13.14.
13.5 The Random Graph and Collapse to MSO
The real ﬁeld is a structure with a decidable theory. So is the structure Z =
Z, +, < , which also admits RQC (see Exercise 13.10). In fact both admit
quantiﬁer elimination: for Z, one has to add all the deﬁnable relations (x −
y) mod k = 0, as well as constant 1.
Could it be true that one can guarantee RQC for every structure M with
decidable theory? We give a negative answer here, which establishes a diﬀerent
kind of collapse: of FO(M, σ) to MSO under the active-domain semantics.
The structure is the random graph RG = U, E , introduced in Chap. 12.
This is any undirected graph on a countably inﬁnite set U that satisﬁes every
sentence that is true in almost all ﬁnite undirected graphs. Recall that the set
of all such sentences forms a complete theory with inﬁnite models, and that
this theory is decidable and ω-categorical.
The random graph satisﬁes the extension axioms EAn,m (12.2), for each
n ≥ m ≥ 0. These say that for every ﬁnite n-element subset S of U, and an
m-element subset T of S, there exists z ∈ S such that (z, x) ∈ E for all x ∈ T ,
and (z, x) ∈ E for all x ∈ S − T .
Recall that MSO (see Chap. 7), is a restriction of second-order logic in
which second-order variables range over sets. We deﬁne MSOact(M, σ) as MSO
over the vocabulary that consists of both Ω and σ, every ﬁrst-order quantiﬁer
is an active-domain quantiﬁer (i.e., ∃x∈adom or ∀x∈adom), and every MSO
quantiﬁer is restricted to the active domain. We write such MSO quantiﬁers
as ∃X ⊆ adom or ∀X ⊆ adom. The semantics is as follows: (M, A) |= ∃X ⊆
adom ϕ(X, ·) if for some set C ⊆ adom(A), it is the case that (M, A) |= ϕ(C, ·).
Theorem 13.22. For every σ,
FO(RG, σ) = MSOact(RG, σ).
Proof. The idea is to use the extension axioms to model MSO queries. Consider
an MSOact formula ϕ(x)
QX1 ⊆adom . . . QXm ⊆adom Qy1 ∈adom . . . Qyn ∈adom α(X, x, y),
266 13 Embedded Finite Models
where the Xi’s are second-order variables, the yj’s are ﬁrst-order variables,
and α is a Boolean combination of σ- and RG-formulae in variables x, y, and
formulae Xi(xj) and Xi(yj). Construct a new FO(RG, σ) formula ϕ′
(x) by
replacing each QXi ⊆ adom with Qzi ∈ adom ∪ x (which is FO-deﬁnable),
and changing every atomic subformula Xi(u) to E(zi, u). In other words, a
subset Xi of the active domain is identiﬁed by an element zi from which there
are edges to all elements of Xi, and no edges to the elements of the active
domain which do not belong to Xi. It is then easy to see, from the extension
axioms, that ϕ′
is equivalent to ϕ. Hence, MSOact(RG, σ) ⊆ FO(RG, σ).
For the other direction, proceed by induction on the FO(RG, σ) formulae.
The only nontrivial case is that of unrestricted existential quantiﬁcation.
Suppose we have an MSOact(RG, σ) formula
ϕ(x, z) ≡ QX ⊆adom Qy∈adom α(X, x, y, z),
where x = (x1, . . . , xn), and α again is a Boolean combination of atomic σand
RG-formulae, as well as formulae Xi(u), where u is one of the ﬁrst-order
variables z, x, y. We want to ﬁnd an MSOact formula equivalent to ∃z ϕ.
Such a formula is a disjunction of the form
∃z ∈adom ϕ ∨
i
ϕ(x, xi) ∨ ∃z ∈ adom ϕ.
Both ∃z ∈ adom ϕ and ϕ(x, xi) are MSOact(RG, σ) formulae. To eliminate
z from ∃z ∈ adom ϕ, all we have to know about z is its connections to x
and to the active domain in the random graph; the former is taken care of
by a disjunction listing all subsets of {1, . . . , n}, and the latter by a secondorder
quantiﬁer over the active domain. For I ⊆ {1, . . . , n}, let χI(x) be a
quantiﬁer-free formula saying that no xi, xj with i ∈ I, j ∈ I, could be equal.
We introduce a new second-order variable Z and deﬁne an MSOact formula
ψ(x) as
∃Z ⊆adom
I⊆{1,...,n}
χI(x) ∧ QX ⊆adom Qy∈adom αZ
I (X, Z, x, y) ,
where αZ
I (X, Z, x, y) is obtained from α by:
1. replacing each E(z, xi) by true for i ∈ I and false for i ∈ I,
2. replacing each E(z, yj) by Z(yj), and
3. replacing each Xi(z) by false.
The extension axioms then ensure that ψ is equivalent to ∃z ∈ adom ϕ.
The active-generic collapse, as it turns out, can be extended to MSO.
Proposition 13.23. Every generic query in MSOact(RG, σ) is expressible in
MSO over σ-structures.
13.6 An Application: Constraint Databases 267
Proof. First, we notice that there exists an inﬁnite subset Z of RG such that
for every pair a, b ∈ Z, there is no edge between a and b (such a subset is easy
to construct using one of the concrete representations of the random graph).
Next, we show by induction on the formulae that for every MSOact(RG, σ)
formula ϕ(X, x) and every inﬁnite set Z′
⊆ Z, there is an inﬁnite set Z′′
⊆ Z
and an MSO formula ϕ′
(X, x) of vocabulary σ such that for every σ-structure
A, and an interpretation of x, X as c, C over adom(A),
(RG, A) |= ϕ(C, c) ↔ ϕ′
(C, c).
Indeed, atomic formulae E(x, y) can be replaced by false. The rest of the
proof is exactly the same as the proof of Lemma 13.15: the active-domain
MSO quantiﬁers are handled exactly as the active-domain FO quantiﬁers.
Next, the same proof as in Lemma 13.16 shows that if ϕ deﬁnes a generic
query, then it is equivalent to ϕ′
over all σ-structures. This proves the propo-
sition.
Corollary 13.24. The class of generic queries expressible in FO(RG, σ) is
precisely the class of queries deﬁnable in MSO over σ-structures.
Thus, RG provides an example of a structure with quantiﬁer elimination
and decidable ﬁrst-order theory (see Exercise 12.8) that does not admit RQC,
but at the same time, one can establish meaningful bounds on the expressiveness
of queries. For example, each generic query in FO(RG, σ) can be evaluated
in PH, and string languages deﬁnable in FO(RG, σ) are precisely the regular
languages.
13.6 An Application: Constraint Databases
The framework of constraint databases can be described formally as the logic
FO(M, σ), where each m-relation S in σ is interpreted not as a ﬁnite set, but
as a deﬁnable subset of Um
. That is, there is a formula αS(x1, . . . , xm) of
FO(M) such that S is the set {a | M |= αS(a)}.
The main application of constraint databases is in querying spatial information.
The key idea of constraint databases is that regions are represented
by FO formulae over some underlying structure: typically either the real ﬁeld
R, or Rlin = R, +, −, 0, 1, < . That is, they are described by polynomial or
linear constraints over the reals.
To illustrate how linear constraints can be used to describe a speciﬁc spatial
database, consider the following example, representing an approximate
map of Belgium (a real map will have many more constraints, but the basic
ideas are the same). Fig. 13.1 shows the map itself, while Fig. 13.2 shows how
regions and cities are described by constraints.
One can then use FO(R, σ) or FO(Rlin, σ) to query those databases as
if they were usual relational databases that store inﬁnitely many points. For
268 13 Embedded Finite Models
Liège
Charleroi
Brussels
Bruges
Bastogne
Hasselt
Antwerp
Walloon Region
Brussels Region
Flemish Region
23222120191817161514131211101 2 3 4 5 6 7 8 9
6
17
16
15
14
13
12
11
10
9
8
1
2
3
4
5
7
Fig. 13.1. Spatial information map of Belgium
example, to ﬁnd all points in the Walloon region that are east of Hasselt one
would write
ϕ(x, y) = Walloon(x, y) ∧ ∃u, v Hasselt(u, v) ∧ x > u . (13.10)
To ﬁnd all the points in the Walloon region that are on the direct line
from Hasselt to Li`ege, one writes a formula ϕ(x, y) as the conjunction of
Walloon(x, y) and
∃u, v, s, t, λ




Hasselt(u, v) ∧ Li´ege(s, t)
∧ 0 ≤ λ ∧ λ ≤ 1
∧ x = λu + (1 − λ)s
∧ y = λv + (1 − λ)t



 . (13.11)
In these examples, (13.10) is an FO( R, < , σ) query, while (13.11) needs
to be expressed in the more expressive language FO(R, σ).
We now give one simple application of embedded ﬁnite models to constraint
databases. A basic property of regions is their topological connectivity.
Most regions represented in geographical databases are connected (and the few
examples of unconnected ones to be rather well known, as they usually lead
to nasty political problems). But can we test this property in FO-based query
13.6 An Application: Constraint Databases 269
Cities
Name Geometry
Antwerp (x = 10)∧
(y = 16)
Bastogne (x = 19)∧
(y = 6)
Bruges (x = 5)∧
(y = 16)
Brussels (x = 10.5)∧
(y = 12.5)
Charleroi (x = 10)∧
(y = 8)
Hasselt (x = 16)∧
(y = 14)
Li`ege (x = 17)∧
(y = 11)
Regions
Name Geometry
Brussels (y ≤ 13) ∧ (x ≤ 11)∧
(y ≥ 12) ∧ (x ≥ 10)
Flanders (y ≤ 17) ∧ (5x − y ≤ 78)∧
(x − 14y ≤ −150)∧
(x + y ≥ 45)∧
(3x − 4y ≥ −53)∧
(¬((y ≤ 13) ∧ (x ≤ 11)∧
∧(y ≥ 12) ∧ (x ≥ 10)))
Walloon ((x − 14y ≥ −150) ∧ (y ≤ 12)∧
(19x + 7y ≤ 375)∧
(x − 2y ≤ 15) ∧ (x ≥ 13)∧
(5x + 4y ≥ 89)) ∨
((3y − x ≥ 5) ∧ (x + y ≥ 45)∧
(x − 14y ≥ −150) ∧ (x ≥ 13))
Fig. 13.2. A spatial database of Belgium
languages? We now give a simple proof of the negative answer, by reduction
to collapse results.
Theorem 13.25. Topological connectivity is not expressible in FO(R, σ).
Proof. Assume, to the contrary, that topological connectivity of sets in R3
is
deﬁnable (one can show that connectivity of sets on the plane is undeﬁnable
as well; the proof involves a slightly more complicated reduction and is the
subject of Exercise 13.5). We show that graph connectivity is then deﬁnable.
Suppose we have a ﬁnite undirected graph G with adom(G) ⊂ R. For
each edge (a, b) in G, we deﬁne the segment s(a, b) in R3
between (a, 1, 0)
and (0, 0, b). Each point in s(a, b) is of the form (λa, λ, (1 − λ)b) for some
0 ≤ λ ≤ 1. Note that this implies that s(a, b) ∩ s(c, d) = ∅ can only happen if
a = c or b = d, since (λa, λ, (1 − λ)b) = (µc, µ, (1 − µ)d) implies λ = µ and
thus for λ = 0, 1 we have a = c and b = d, for λ = 0 we get b = d, and for
λ = 1 we get a = c.
Now we encode each edge (a, b) by the set e(a, b) = s(a, b) ∪ s(b, a) ∪
s(a, a) ∪ s(b, b) (see Fig. 13.3). Note that e(a, b) is a connected set, and that
e(a, b) ∩ e(c, d) = ∅ iﬀ the edges (a, b) and (c, d) have a common node.
We then deﬁne a new set XG in R3
as
XG =
(a,b)∈G
e(a, b).
It follows that XG is topologically connected iﬀ G is connected as a graph.
Since the transformation G → XG is deﬁnable in FO(R, σ), the assumption
270 13 Embedded Finite Models
a
b
1
a b
Fig. 13.3. Embedding an edge (a, b) into R3
that topological connectivity is deﬁnable implies that so is graph connectivity.
However, we know from Corollary 13.21 that graph connectivity cannot be
expressed. This contradiction proves the theorem.
13.7 Bibliographic Notes
The framework of embedded ﬁnite models originated in database theory, in
connection with attempts to understand query languages that use interpreted
operations, as well as query languages for constraint databases. Constraint
databases were introduced by Kanellakis, Kuper, and Revesz [142] (see also
the surveys by Kuper, Libkin, and Paredaens [158], Libkin [168], and Van den
Bussche [242]).
Soon after [142] was published, it became clear that many questions about
languages for constraint databases reduce to questions about embedded ﬁnite
models. For example, Grumbach and Su [115] present many reductions to the
ﬁnite case.
Collapse results as a technique for proving bounds on FO(M, σ) were introduced
by Paredaens, Van den Bussche, and Van Gucht [197], where the
restrcited-quantiﬁer collapse for Rlin was proved. The collapse for the real
ﬁeld was shown by Benedikt and Libkin [19] (in fact the proof in [19] applies
to a larger class of o-minimal structures; see [243]). The active-generic collapse
was shown by Otto and Van den Bussche [193]; the proof given here follows
[19]. For the basics of Ramsey theory, see Graham, Rothschild, and Spencer
[103]. The collapse to MSO over the random graph is from [168], although one
direction was proved earlier by [193].
13.8 Exercises 271
Inexpressibility of connectivity by reduction to the ﬁnite case was ﬁrst
shown in [115]; for a diﬀerent approach that characterizes topological properties
expressible in FO(R, {S}), where S is binary, see Kuijpers, Paredaens, and
Van den Bussche [157]. For a study of these problems over complex numbers,
we refer to Chapuis and Koiran [36]. See also Exercise 13.6.
Although we said in the beginning of the chapter that no collapse results
were proved with the help of Ehrenfeucht-Fra¨ıss´e games, results by Fournier
[83] show how to use games to establish bounds on the quantiﬁer rank for
expresssing certain properties over embedded ﬁnite models. An example is
presented in Exercise 13.8.
In this chapter we used a number of well-known results in classical model
theory, such as decidability and quantiﬁer elimination for the real ﬁeld R (see
Tarski [229]) and undecidability of the FO theory of Q, +, · (see Robinson
[206]).
Sources for exercises:
Exercise 13.4: Benedikt and Libkin [19]
Exercise 13.6: Chapuis and Koiran [36]
Exercise 13.7: Grumbach and Su [115]
Exercise 13.8: Fournier [83]
Exercise 13.9: Hull and Su [127]
Exercise 13.10: Flum and Ziegler [82]
(see also [168] for a self-contained proof)
Exercise 13.11: Benedikt and Libkin [19]
Exercise 13.12: Flum and Ziegler [82]
Exercise 13.13: Barrington et al. [15]
Exercises 13.14–13.16: Benedikt et al. [21]
13.8 Exercises
Exercise 13.1. Give an example of a noncomputable query expressible in FO(N, σ).
Exercise 13.2. Prove that it is undecidable if a query expressible in FO(M, σ) is
generic (even if the theory of M is decidable).
Exercise 13.3. Suppose that S is a binary relation symbol, and R is a ternary one,
and both are interpreted as sets deﬁnable over the real ﬁeld R = R, +, ·, 0, 1, < .
Show how to express the following in FO(R, {S, R}):
• S is a graph of a function f : R → R;
• S is a graph of a continuous function f : R → R;
• S is a graph of a diﬀerentiable function f : R → R;
• R is a trajectory of an object: that is, a triple (x, y, t) ∈ R gives a position (x, y)
at time t;
• a formula ϕ(x, y, v) which holds iﬀ v is the speed of the object at time t (assuming
that R deﬁnes a trajectory).
272 13 Embedded Finite Models
Exercise 13.4. Prove a generalization of the Ramsey property (i.e., each activesemantics
sentence expressing a generic query can be written using just the order
relation) for SO, ∃SO, FO(Cnt), and a ﬁxed point logic of your choice. Also prove
that Lω
∞ω does not have such a generalized Ramsey property.
Exercise 13.5. Use a reduction diﬀerent from the one in the proof of Theorem 13.25
to show that topological connectivity of subsets of R2
is not deﬁnable in FO(R, {S}),
where S is binary.
Exercise 13.6. Prove that topological connectivity of subsets of C2
which are deﬁnable
in C, +, −, ·, 0, 1 cannot be expressed in FO( C, +, −, ·, 0, 1 , {S}), where S
is binary.
Exercise 13.7. Prove that if S and S′
are interpreted as subsets of R2
deﬁnable in
R, then none of the following is expressible in FO(R, {S, S′
}):
• S contains at least one hole (assuming S is a closed set).
• S has a Eulerian traversal. That is, if S is a union of line segments, then it has
a traversal going through each line segment exactly once.
• S and S′
are homeomorphic.
Use reductions to the ﬁnite case for all three problems.
Exercise 13.8. Show that in FO(R, σ) one can express even for sets of cardinality
up to n using a sentence of quantiﬁer rank O(
√
log n).
Exercise 13.9. Prove the natural-active collapse for U∅ = U, ∅ .
Exercise 13.10. Prove the restricted quantiﬁer collapse for Z, +, < .
Exercise 13.11. An ordered structure M = U, Ω, < is called o-minimal if
every deﬁnable subset of U is a ﬁnite union of points and open intervals
(a, b), (−∞, a), (a, ∞).
Prove the restricted quantiﬁer collapse for an arbitrary o-minimal structure.
Hint: you will need the following uniform bounds result of Pillay and Steinhorn
[198]. If ϕ(x, y) is an FO(M) formula, then there exists a constant k such that
for every b, the set {a | M |= ϕ(a, b)} is a union of fewer than k points and open
intervals.
One can use this result to infer that R, +, ·, ex
admits the restricted quantiﬁer
collapse, since Wilkie [248] proved that it is o-minimal.
Exercise 13.12. We say that a structure M has the ﬁnite cover property if there is
a formula ϕ(x, y) such that for every n > 0, one can ﬁnd tuples a1, . . . , an such that
∃x
V
j=i ϕ(x, aj) holds for each i ≤ n, but ∃x
V
j≤n ϕ(x, aj) does not hold.
• Prove that if M does not have the ﬁnite cover property, then it admits the
restricted quantiﬁer collapse.
• Conclude that C, +, · and N, succ admit the restricted quantiﬁer collapse.
13.8 Exercises 273
Exercise 13.13. We say that a language L ⊆ Σ∗
has a neutral letter if there exists
a ∈ Σ such that for every two strings s, s′
∈ Σ∗
, we have s · s′
∈ L iﬀ s · a · s′
∈ L.
Now let Ω be a set of arithmetic predicates. We say that a language L is FO(Ω)deﬁnable
if there is an FO sentence ΦL of vocabulary σΣ ∪ Ω such that MΩ
s |= ΦL
iﬀ s ∈ L. Here MΩ
s is the structure Ms expanded with the interpretation of Ωpredicates
on its universe.
The following statement is known as the Crane Beach conjecture for Ω: if L is
FO(Ω)-deﬁnable and has a neutral letter, then it is star-free.
• Use Exercise 13.10 to prove that the Crane Beach conjecture is true when Ω =
{+++} (the graph of the addition operation).
• Prove that the Crane Beach conjecture is false when Ω = {+++,×××} (hint: use
Theorem 6.12).
Exercise 13.14. Consider the structure Σ∗
, ≺, (fa)a∈Σ , where ≺ is the preﬁx
relation, and fa : Σ∗
→ Σ∗
is deﬁned by fa(x) = x · a. Prove that this structure
has the restricted quantiﬁer collapse. Prove that it still has the restricted quantiﬁer
collapse when augmented with the following:
• The predicate PL, for each regular language L, that is true of s iﬀ s is in L.
• The functions ga : Σ∗
→ Σ∗
deﬁned by ga(x) = a · x.
Exercise 13.15. Suppose S is an inﬁnite set, and C ⊆ 2S
is a family of subsets of
S. Let F ⊂ S be ﬁnite; we say that C shatters F if the collection {F ∩ C | C ∈ C}
is ℘(F), the powerset of F. The Vapnik-Chervonenkis (VC) dimension of C is the
maximal cardinality of a ﬁnite set shattered by C. If arbitrarily large ﬁnite sets are
shattered by C, we let the VC dimension be ∞.
If M is a structure and ϕ(x, y) is an FO(M) formula, with | x |= n, | y |= m,
then for each a ∈ Un
, we deﬁne ϕ(a, M) = {b ∈ Um
| M |= ϕ(a, b)}, and let Fϕ(M)
be {ϕ(a, M) | a ∈ Un
}. Families of sets arising in such a way are called deﬁnable
families. We say that M has ﬁnite VC dimension if every deﬁnable family in M has
ﬁnite VC dimension.
Prove that if M admits the restricted quantiﬁer collapse, then it has ﬁnite VC
dimension.
Exercise 13.16. Consider an expansion M of Σ∗
, ≺, (fa)a∈Σ with the predicate
el(x, y) which is true iﬀ |x |=|y |. We have seen this structure in Chap. 7 (Exercise
7.20); it deﬁnes precisely the regular relations.
Prove that FO(M, σ) cannot express even.
Exercise 13.17.∗
For the structure M of Exercise 13.16, is FOgen
act (M, σ) contained
in FOact(U<, σ)?
14
Other Applications of Finite Model Theory
In this ﬁnal chapter, we brieﬂy outline three diﬀerent application areas of
ﬁnite model theory. In mathematical logic, ﬁnite models are used as a tool for
proving decidability results for satisﬁability of FO sentences. In the area of
temporal logics and veriﬁcation, one analyzes the behavior of certain logics on
some special ﬁnite structures (Kripke structures). And ﬁnally, it was recently
discovered that many constraint satisfaction problems can be reduced to the
existence of a homomorphism between two ﬁnite structures.
14.1 Finite Model Property and Decision Problems
The classical decision problem in mathematical logic is the satisﬁability problem
for FO sentences: that is,
Given a ﬁrst-order sentence Φ, does it have a model?
We know that in general, satisﬁability is undecidable. However, a complete
classiﬁcation of decidable fragments in terms of quantiﬁer-preﬁx classes exists.
For the rest of the section, we assume that the vocabulary is purely relational.
We have already seen classes of formulae deﬁned by their quantiﬁer preﬁxes
in Sect. 12.4. For a regular expression r over the alphabet {∃, ∀}, we denote
by FO(r) the set of all prenex sentences
Q1x1 . . . Qnxn ϕ(x1, . . . , xn),
where the string Q1 . . . Qn is in the language denoted by r. Here, each Qi is
either ∃ or ∀, and ϕ is quantiﬁer-free.
It is known that there are precisely two maximal preﬁx classes for which the
satisﬁability problem is decidable: these are FO(∃∗
∀∗
) (known as the BernaysSch¨onﬁnkel
class), and FO(∃∗
∀∃∗
) (known as the Ackermann class).
The proof technique in both cases relies on the following property.
276 14 Other Applications of Finite Model Theory
Deﬁnition 14.1. We say that a class K of sentences has the ﬁnite model
property if for every sentence Φ in K, either Φ is unsatisﬁable, or it has a
ﬁnite model.
In other words, in a class K that has the ﬁnite model property, every
satisﬁable sentence has a ﬁnite model.
It turns out that both FO(∃∗
∀∗
) and FO(∃∗
∀∃∗
) have the ﬁnite model
property, and, furthermore, there is an upper bound on the size of a ﬁnite
model of Φ in terms of Φ , the size of Φ. We prove this for the BernaysSch¨onﬁnkel
class.
Proposition 14.2. If Φ is a satisﬁable sentence of FO(∃∗
∀∗
), then it has a
model whose size is at most linear in Φ .
Proof. Let Φ be
∃x1 . . . ∃xn∀y1 . . . ∀ym ϕ(x, y),
where ϕ is quantiﬁer-free. Let ψ(x) be ∀y ϕ(x, y).
Since Φ is satisﬁable, it has a model A. Let a1, . . . , an witness the existential
quantiﬁers: that is, A |= ψ(a). Let A′
be the ﬁnite substructure of A whose
universe is {a1, . . . , an}. Since ψ is a universal formula, it is preserved under
taking substructures. Hence, A′
|= ψ(a), and therefore, A′
|= Φ. Thus, we have
shown that Φ has a model whose universe has at most n elements.
This immediately gives us the decision procedure for the class FO(∃∗
∀∗
):
given a sentence Φ with n existential quantiﬁers, look at all nonisomorphic
structures whose universes are of size up to n, and check if any of them is a
model of Φ. This algorithm also suggests a complexity bound: one can guess
a structure A with |A| ≤ n, and check if A |= Φ. Notice that in terms of Φ ,
the size of such a structure could be exponential. For each relation symbol
R of arity m, there could be up to nm
diﬀerent tuples in RA
. Since there
is no a priori bound on the arity of R, it may well depend on Φ , which
gives us an exponential upper bound on A . Hence, the algorithm runs in
nondeterministic exponential time.
It turns out that one cannot improve this bound.
Theorem 14.3. The satisﬁability problem for FO(∃∗
∀∗
) is Nexptime-
complete.
If we have a vocabulary of bounded arity (i.e., there is a constant k such
that every relation symbol has arity at most k), then the size of a structure
on n elements is at most polynomial in n. Thus, in this case one has to check
if A |= ϕ, where A is polynomial in n. As we know from the results
on the combined complexity of FO, this can be done in Pspace. Hence, for
a vocabulary of bounded arity, the satisﬁability problem for FO(∃∗
∀∗
) is in
Pspace.
14.1 Finite Model Property and Decision Problems 277
We now see an application of this decidability result in database theory. In
Chap. 6, we studied conjunctive queries: those of the form ∃xϕ, where ϕ is a
conjunction of atomic formulae. We also saw (Exercise 6.19) that containment
of conjunctive queries is NP-complete.
Another class of queries often used in database theory is unions of conjunctive
queries; that is, queries of the form Q1 ∪ . . . ∪ Qm, where each Qi
is a conjunctive query. Can the decidability of containment be extended to
union of conjunctive queries? That is, is it decidable whether Q(A) ⊆ Q′
(A)
for all A, when Q and Q′
are unions of conjunctive queries? We now give the
positive answer using the decidability of the Bernays-Sch¨onﬁnkel class.
Putting all existential quantiﬁers in front, we can assume without loss of
generality that Q is given by ϕ(x) ≡ ∃y α(x, y), and Q′
by ψ(x) ≡ ∃y β(x, y),
where α and β are monotone Boolean combinations of atomic formulae. Our
goal is to check whether Φ ≡ ∀x (ϕ(x) → ψ(x)) is a valid sentence.
Assuming that y and z are distinct variables, we can rewrite Φ as
∀x ∀y ∃z ¬α(x, y) ∨ β(x, z) .
We know that Φ is valid iﬀ ¬Φ is not satisﬁable. But ¬Φ is equivalent to
∃x ∃y ∀z α ∧ ¬β ; that is, to an FO(∃∗
∀∗
) sentence. This gives us the fol-
lowing.
Proposition 14.4. Fix a relational vocabulary σ. Let Q and Q′
be unions
of conjunctive queries over σ. Then testing whether Q ⊆ Q′
is decidable in
Pspace.
The complexity bound given by the reduction to the Bernays-Sch¨onﬁnkel
class is not the optimal one, but it is not very far oﬀ: for a ﬁxed vocabulary
σ, the complexity of containment of unions of conjunctive queries is known to
be Πp
2 -complete.
We now move to the Ackermann class FO(∃∗
∀∃∗
). Again, we have the
ﬁnite model property.
Theorem 14.5. Let Φ be an FO(∃∗
∀∃∗
) sentence. If Φ is satisﬁable, then it
has a model whose size is at most exponential in Φ .
Even though the size of the ﬁnite model jumps from linear to exponential,
the complexity of the decision problem does not get worse, and in fact in some
cases the problem becomes easier.
Theorem 14.6. The satisﬁability problem for FO(∃∗
∀∃∗
) is Nexptimecomplete.
Furthermore, when restricted to sentences that do not mention
equality, the problem becomes Exptime-complete.
Finally, we consider ﬁnite variable restrictions of FO. Recall that FOk
refers to the fragment of FO that consists of formulae in which at most k
distinct variables are used.
278 14 Other Applications of Finite Model Theory
green
¬red
¬yellow
¬red
¬green
yellow
red
¬green
¬yellow
Fig. 14.1. An example of a Kripke structure
Theorem 14.7. FO2
has the ﬁnite model property: each satisﬁable FO2
sentence
has a ﬁnite model whose size is at most exponential in Φ . Furthermore,
the satisﬁability problem for FO2
is Nexptime-complete. The satisﬁability
problem for FOk
, k > 2, is undecidable.
14.2 Temporal and Modal Logics
In this section, we look at logics that are used in verifying temporal properties
of reactive systems. The ﬁnite structure in this case is usually a transition
system, or a Kripke structure. It can be viewed as a labeled directed graph,
where the nodes describe possible states the system could be in, and the edges
indicate when a transition from one state to another is possible. To describe
possible states of the system, one uses a collection of propositional variables,
and speciﬁes which of them are true in a given state.
An example of a Kripke structure is given in Fig. 14.1. We have three
propositional variables, red, green, and yellow. The states are those in which
only one variable is true, and the other two are false. As expected, from a red
light one can go to green, from green to yellow, and from yellow to red, and
the system can stay in any of these states.
Sometimes edges of Kripke structures are labeled too, but since it is easy
to push those labels back into the states, we shall assume that edges are not
labeled.
Thus, formally, a Kripke structure, for a ﬁnite alphabet Σ, is a ﬁnite
structure K = S, E, (Pa)a∈Σ , where S is the set of states, E is a binary
relation on S, and for each a ∈ Σ, Pa is a unary relation on S, i.e., a subset
of S. Since assigning relations Pa can be viewed as labeling states with letters
from Σ, we shall also refer to the labeling function λ : S → 2Σ
, given by
14.2 Temporal and Modal Logics 279
λ(s) = {a ∈ Σ | s ∈ Pa}.
We now deﬁne the simplest of the logics we deal with in this section:
the propositional modal logic, ML. Its formulae are given by the following
grammar:
ϕ, ψ ::= a (a ∈ Σ) | ϕ ∧ ψ | ¬ϕ | ϕ | ♦ϕ. (14.1)
The semantics of ML formulae is given with respect to a Kripke structure K
and a state s. That is, each formula deﬁnes a set of states where it holds. The
formal deﬁnition of the semantics is as follows:
• (K, s) |= a, a ∈ Σ iﬀ a ∈ λ(s);
• (K, s) |= ϕ ∧ ψ iﬀ (K, s) |= ϕ and (K, s) |= ψ;
• (K, s) |= ¬ϕ iﬀ (K, s) |= ϕ;
• (K, s) |= ϕ iﬀ (K, s′
) |= ϕ for all s′
such that (s, s′
) ∈ E;
• (K, s) |= ♦ϕ iﬀ (K, s′
) |= ϕ for some s′
such that (s, s′
) ∈ E.
Thus, is the “for all” modality, and ♦ is the “there exists” modality: ϕ
(♦ϕ) means that ϕ holds in every (in some) state to which there is an edge
from the current state.
Notice also that ♦ is superﬂuous since ♦ϕ is equivalent to ¬ ¬ϕ.
ML can be translated into FO as follows. For each ML formula ϕ, we
deﬁne an FO formula ϕ◦
(x) such that (K, s) |= ϕ iﬀ K |= ϕ◦
(s). This is done
as follows:
• a◦
≡ Pa(x);
• (ϕ ∧ ψ)◦
≡ ϕ◦
∧ ψ◦
;
• (¬ϕ)◦
≡ ¬ϕ◦
;
• ( ϕ)◦
≡ ∀y R(x, y) → ∀x x = y → ϕ◦
(x) .
For the translation of ϕ, we employed the technique of reusing variables that
was central in Chapter 11. Thus, ϕ◦
is always an FO2
formula, as it uses only
two variables: x and y. Summing up, we obtained the following.
Proposition 14.8. Every formula of the propositional modal logic ML is
equivalent to an FO2
formula. Consequently, every satisﬁable formula ϕ of
ML has a model which is at most exponential in ϕ .
The expressiveness of ML is rather limited; in particular, since it is a
fragment of FO, it cannot express reachability properties which are of utmost
importance in verifying properties of ﬁnite-state systems. We thus move to
more expressive logics, LTL and CTL.
280 14 Other Applications of Finite Model Theory
The formulae of the linear time temporal logic, LTL, are given by the
following grammar:
ϕ, ϕ′
::= a (a ∈ Σ) | ¬ϕ | ϕ ∧ ϕ′
| Xϕ | ϕUϕ′
. (14.2)
The formulae of the computation tree logic, CTL, are given by
ϕ, ϕ′
::= a (a ∈ Σ) | ¬ϕ | ϕ ∧ ϕ′
|
EXϕ | AXϕ | E(ϕUϕ′
) | A(ϕUϕ′
).
(14.3)
In both of these logics, we talk about properties of paths in the Kripke
structure. A path in K is an inﬁnite sequence of nodes π = s1s2 . . . such that
(si, si+1) ∈ E for all i. Of course, in a ﬁnite structure, some of the nodes must
occur inﬁnitely often on a path.
The connective X means “next time”, or “for the next node on the path”.
The connective U is “until”: ϕ holds until some point where ϕ′
holds. E is
the existential quantiﬁer “there is a path”, and A is the universal quantiﬁer:
“for all paths”.
To give the formal semantics, we introduce a logic that subsumes both
LTL and CTL. This logic, denoted by CTL∗
, has two kinds of formulae: state
formulae denoted by ϕ, and path formulae denoted by ψ. These are given by
the following two grammars:
ϕ, ϕ′
::= a (a ∈ Σ) | ¬ϕ | ϕ ∧ ϕ′
| Eψ | Aψ
ψ, ψ′
::= ϕ | ¬ψ | ψ ∧ ψ′
| Xψ | ψUψ′
.
(14.4)
The semantics of a state formula is again given with respect to a Kripke
structure K and a state s. The semantics of a path formula ψ is given with
respect to K and a path π in K. If π = s1s2s3 . . ., we shall write πk
for the
path starting at sk; that is, sksk+1 . . ..
Formally, we deﬁne the semantics as follows:
• (K, s) |= a, a ∈ Σ iﬀ a ∈ λ(s);
• (K, s) |= ϕ ∧ ϕ′
iﬀ (K, s) |= ϕ and (K, s) |= ϕ′
;
• (K, s) |= ¬ϕ iﬀ (K, s) |= ϕ;
• (K, s) |= Eψ iﬀ there is a path π = s1s2 . . . such that s1 = s and (K, π) |=
ψ;
• (K, s) |= Aψ iﬀ for every path π = s1s2 . . . such that s1 = s, we have
(K, π) |= ψ;
• if ϕ is a state formula, and π = s1s2 . . ., then (K, π) |= ϕ iﬀ (K, s1) |= ϕ;
• (K, π) |= ψ ∧ ψ′
iﬀ (K, π) |= ψ and (K, π) |= ψ′
;
• (K, π) |= ¬ψ iﬀ (K, π) |= ψ;
• (K, π) |= Xψ iﬀ (K, π2
) |= ψ;
14.2 Temporal and Modal Logics 281
• (K, π) |= ψUψ′
if there exists k ≥ 1 such that (K, πk
) |= ψ′
and (K, πi
) |= ψ
for all i < k.
Note that LTL formulae are path formulae, and CTL formulae are state
formulae. LTL formulae are typically evaluated along a single inﬁnite path
(hence the name linear temporal logic). On the other hand, CTL is well-suited
to describe branching processes (hence the name computation tree logic). If
we want to talk about an LTL formula ψ being true in a given state of a
Kripke structure, we shall mean that the formula Aψ is true in that state.
Some derived formulae are often useful in describing temporal properties.
For example, Fψ ≡ trueUψ, means “eventually”, or sometime in the future,
ψ holds, and Gψ ≡ ¬F¬ψ means “always”, or “globally”, ψ holds (true itself
can be assumed to be a formula in any of the logics: for example, a ∨ ¬a).
Thus, AGψ means that ψ holds along every path starting from a given state,
and EFψ means that along some path, ψ eventually holds.
For the example in Fig. 14.1, consider a CTL formula AG(yellow →
AFgreen), saying that if the light is yellow, it will eventually become green.
This formula is actually false in the structure shown in Fig. 14.1, since
yellow can continue to hold indeﬁnitely long due to the loop. However,
AG yellow → (AGyellow∨AFgreen) , saying that either yellow holds forever
or eventually changes to green, is true in that structure.
The main diﬀerence between CTL and LTL is that CTL is better suited
for talking about branching paths that start in a given node (this is the
reason logics like CTL are sometimes referred to as branching-time logics),
while LTL, on the other hand, is better suited for talking about properties of
a single path starting in a given node (and thus one speaks of a linear-time
logic). For example, consider the CTL formula AG(EFa). It says that along
every path from a given node, from every node there is a path that leads to a
state labeled a. It is known that this formula is not expressible in LTL. The
formula A(FGa), saying that on every path, starting from some node a will
hold forever, is a state formula resulting by applying the A quantiﬁer to the
LTL formula FGa; this formula is not expressible in CTL.
While all the examples seen so far could have been speciﬁed in other logics
used in this book – for example, MSO or LFP – the main advantage of these
temporal logics is that the model-checking problem for them can be solved
eﬃciently. The model-checking problem is to determine whether (K, s) |= ϕ,
for some Kripke structure K, state s, and a formula ϕ. The data complexity
for CTL∗
and its sublogics can easily be seen to be polynomial (since CTL∗
formulae can be expressed in LFP), but it turns out that the situation is much
better than this.
Theorem 14.9. The model-checking problem for ML, LTL, CTL, and CTL∗
is ﬁxed-parameter linear. For logics ML and CTL it can be solved in time
O( ϕ · K ) and for LTL and CTL∗
, the bound is 2O( ϕ )
· K .
282 14 Other Applications of Finite Model Theory
We illustrate the idea of the proof for the case of ML. Suppose we have a
formula ϕ and a Kripke structure K. Consider all the subformulae ϕ1, . . . , ϕk
of ϕ listed in an order that ensures that if ϕj is a subformula of ϕi, then j < i.
The algorithm then inductively labels each state s of K with either ϕi or ¬ϕi,
depending on which formula holds in that state. For the base case, there is
nothing to do since the states are already labeled with either a or ¬a for each
a ∈ Σ. For the induction, the only nontrivial case is when ϕi ≡ ϕj for some
j ≤ i. Then for each state s, we check all the states s′
with (s, s′
) ∈ E, and
see if all such s′
have been labeled with ϕj in the jth step: if so, we label s
by ϕi; if not, we label it by ¬ϕi. This algorithm can be implemented in time
O( ϕ · K ).
Next, we look at the connection between temporal and modal logics and
other logics for ﬁnite structures we have seen. We already mentioned that ML
can be embedded into FO2
. What about LTL? We can answer this question
for a simple kind of Kripke structures used in Chap. 7: these are structures of
the vocabulary σΣ = (<, (Pa)a∈Σ), used to represent strings.
Theorem 14.10. Over ﬁnite strings viewed as structures of vocabulary σΣ,
LTL and FO are equally expressive: LTL = FO.
Interestingly, Theorem 14.10 holds for ω-strings as well, but this is outside
the scope of this book.
For CTL, one needs to talk about diﬀerent paths, and hence one should
be able to express reachability properties such as “can a state labeled a be
reached from a state labeled b”? This suggests a close connection between
CTL and logics that can express the transitive closure operator. We illustrate
this by means of the following example.
Consider a CTL formula AFa stating that along every path, a eventually
holds. We now express this in a variant of Datalog. Let (Π1, T ) be the
following Datalog¬ program:
R(x, y) :– ¬Pa(x), E(x, y)
R(x, y) :– ¬Pa(z), R(x, z), E(z, y)
This program computes a subset of the transitive closure: the set of pairs
(b, b′
) for which there is a path b = b1, b2, . . . , bn−1, bn = b′
such that none of
the bi’s, i < n, is labeled a. Next, we deﬁne a program (Π2, U) that uses R as
an extensional predicate:
U(x) :– R(x, x)
U(x) :– ¬Pa(x), E(x, y), U(y)
Suppose we have an inﬁnite path over K. Since K is ﬁnite, it must have a loop.
If there is a loop such that R(x, x) holds, then there is an inﬁnite path from
x such that ¬a holds along this path. If we have any other path such that ¬a
holds along it, then it starts with a few edges and eventually enters a loop in
14.2 Temporal and Modal Logics 283
which no node is labeled a. Hence, U is the set of nodes from which there is
an inﬁnite path on which ¬a holds. Thus, taking the program (Π3, Q) given
by
Q(x) :– ¬U(x)
we get a program that computes AFa. Notice that this program is stratiﬁed
(for each stratum, the negated predicates are those deﬁned in the previous
strata) and linear (each intensional predicate appears at most once in the
right hand sides of rules). The above translation techniques can be extended
to prove the following.
Theorem 14.11. CTL formulae can be expressed in either of the following:
• the linear stratiﬁed Datalog¬;
• the transitive closure logic TrCl.
Next, we deﬁne a ﬁxed point modal logic, called the µ-calculus and denoted
by Calcµ, that subsumes LTL, CTL, and CTL∗
. Consider the propositional
modal logic ML, and extend its syntax with propositional variables x, y, . . .,
viewed as monadic second-order variables (i.e., each such variable denotes a
set of states). Now formulae have free variables. Suppose we have a formula
ϕ(x, y) where x occurs positively in ϕ. Then µx.ϕ(x, y) is a formula with free
variables y.
To deﬁne the semantics of ψ(y) ≡ µx.ϕ(x, y) on a Kripke structure K,
assume that each yi from y is interpreted as a propositional variable: that is,
a subset Yi of S consisting of nodes where it holds. Then ϕ(x, Y ) deﬁnes an
operator FY
ϕ : 2S
→ 2S
given by
FY
ϕ (X) = {s ∈ S | (K, s) |= ϕ(X, Y )}.
If x occurs positively, then this operator is monotone. We deﬁne the semantics
of the µ operator by
(K, s) |= µx.ϕ(x, Y ) ⇔ s ∈ lfp(FY
ϕ ).
Consider, for example, the formula µx.a ∨ x. This formula is true in (K, s)
if along each path starting in s, a will eventually become true. Hence, this is
the CTL formula AFa. In general, every CTL∗
formula can be expressed in
Calcµ.
Each Calcµ formula ϕ can be translated into an LFP formula ϕ◦
(x)
such that (K, s) |= ϕ iﬀ K |= ϕ◦
(s). Furthermore, one can show that Calcµ
formulae can be translated into MSO formulae as well. Summing up, we have
the following relationship between the temporal logics:
ML
LTL
CTL
CTL∗
Calcµ
LFP
MSO
.
284 14 Other Applications of Finite Model Theory
a
a
a
a a
K1 K2
Fig. 14.2. Bisimulation equivalence
In the µ-calculus, it is common to use both least and greatest ﬁxed points.
The latter are deﬁnable by νx.ϕ(x) ≡ ¬µx.¬ϕ(¬x), assuming that x occurs
positively in ϕ. Notice that negating both ϕ and each occurrence of x in it
ensures that if x occurs positively in ϕ, then it occurs positively in ¬ϕ(¬x),
and hence the least ﬁxed point is well-deﬁned. Using the greatest and the least
ﬁxed points, the formulae of the µ-calculus can be written in the alternating
style so that negations are applied only to propositions. We shall denote the
fragment of the µ-calculus that consists of such alternating formulae with
alternation depth at most k by Calck
µ.
Theorem 14.12. The complexity of the model-checking problem for Calck
µ
is O( ϕ · K
k
).
Since Calcµ can be embedded into LFP, its data complexity is polynomial.
The combined complexity is known to be in NP ∩ coNP. Furthermore,
Calcµ has the ﬁnite model property: if ϕ is a satisﬁable formula of Calcµ,
then there is a Kripke structure K of size at most exponential in ϕ such that
(K, s) |= ϕ for some s ∈ S.
Finally, we present another way to connect temporal logics with other logics
seen in this book. Since logics like Calcµ talk about temporal properties
of paths, they cannot distinguish structures in which all paths agree on all
temporal properties, even if the structures themselves are diﬀerent. For example,
consider the structures K1 and K2 shown in Fig. 14.2. Even though they
are diﬀerent, all the paths realized in these structures are the same: an inﬁnite
path on which every node is labeled a. Calcµ cannot see the diﬀerence
between them, although these structures are easily distinguished by the FO
sentence “There is a node with two distinct successors.”
One can formally capture this notion of indistinguishability using the deﬁnition
of bisimilarity. Let K = S, E, (Pa)a∈Σ and K′
= S′
, E′
, (P′
a)a∈Σ . We
say that (K, s) and (K′
, s′
) are bisimilar if there is a binary relation R ⊆ S ×S′
such that
• (s, s′
) ∈ R;
14.3 Constraint Satisfaction and Homomorphisms of Finite Models 285
• if (u, u′
) ∈ R, then Pa(u) iﬀ P′
a(u′
), for all a ∈ Σ;
• if (u, u′
) ∈ R and (u, v) ∈ E, then there is v′
∈ S′
such that (v, v′
) ∈ R
and (u′
, v′
) ∈ E′
;
• if (u, u′
) ∈ R and (u′
, v′
) ∈ E′
, then there is v ∈ S such that (v, v′
) ∈ R
and (u, v) ∈ E.
A property of Kripke structures is bisimulation-invariant if whenever it
holds in (K, s), it also holds in every (K′
, s′
) which is bisimilar to (K, s). As
we have seen, even FO can express properties which are not bisimulationinvariant,
but Calcµ and its sublogics only express bisimulation-invariant
properties.
The following result shows how to use bisimulation-invariance to relate
temporal logics and other logics seen in this book.
Theorem 14.13. • The class of bisimulation-invariant properties expressible
in FO is precisely the class of properties expressible in ML.
• The class of bisimulation-invariant properties expressible in MSO is precisely
the class of properties expressible in Calcµ.
14.3 Constraint Satisfaction and Homomorphisms of
Finite Models
Constraint satisfaction problems are problems of the following kind. Suppose
we are given a set V of variables, a ﬁnite domain D where the variables can
take values, and a set of constraints C. The problem is whether there exists
an assignment of values to variables that satisﬁes all the constraints.
Each constraint in the set C is speciﬁed as a pair (v, R) where v is a tuple of
variables from V , of length n, and R is an n-ary relation on D. The assignment
of values to variables is then a mapping h : V → D. Such a mapping satisﬁes
the constraint (v, R) if h(v) ∈ R.
For example, satisﬁability of certain propositional formulae can be viewed
as a constraint satisfaction problem. Consider the MONOTONE 3-SAT problem.
That is, we have a CNF formula ϕ(x1, . . . , xm) in which every clause
is either (xi ∨ xj ∨ xk), or (¬xi ∨ ¬xj ∨ ¬xk). Consider the constraint satisfaction
problem where V = {x1, . . . , xn}, D = {0, 1}, and for each clause
(xi∨xj∨xk) we have a constraint ((xi, xj, xk), {0, 1}3
−{(0, 0, 0)}), and for each
clause (¬xi ∨¬xj ∨¬xk) we have a constraint ((xi, xj, xk), {0, 1}3
−{(1, 1, 1)}).
Then the resulting constraint satisfaction problem (V, D, C) has a solution iﬀ
ϕ is satisﬁable.
There is a nice representation of constraint satisfaction problems in terms
of the existence of a certain homomorphism between ﬁnite structures.
Suppose we are given a constraint satisfaction problem P = (V, D, C). Let
RP
1 , . . . , RP
l list all the relations mentioned in C. Let σP = (R1, . . . , Rl). We
deﬁne two σP -structures as follows:
286 14 Other Applications of Finite Model Theory
AP = V, {v | (v, RP
1 ) ∈ C}, . . . , {v | (v, RP
m) ∈ C}
BP = D, RP
1 , . . . , RP
m .
Then
P has a solution
⇔ there exists a homomorphism h : AP → BP .
Thus, the constraint satisfaction problem is really the problem of checking
whether there is a homomorphism between two structures. We thus use the
notation
CSP(A, B) ⇔ there exists a homomorphism h : A → B.
To see another example, let Km be the clique on m elements. Then
CSP(G, Km) holds iﬀ G is m-colorable.
The constraint satisfaction problem can easily be related to conjunctive
query evaluation. Suppose we have a vocabulary σ that consists only of relation
symbols, and a σ-structure A. Let A = {a1, . . . , an}. We deﬁne the
Boolean conjunctive query CQA as
CQA ≡ ∃x1 . . . ∃xn
R∈σ (ai1 ,...,aim )∈RA
R(xi1 , . . . , xim ).
Proposition 14.14. CSP(A, B) is true iﬀ B |= CQA.
If C and C′
are two classes of structures, then we write CSP(C, C′
) for the
class of problems CSP(A, B) where A ∈ C and B ∈ C′
. We use All for the
class of all ﬁnite structures.
The m-colorability example shows that CSP(All, All) contains NP-hard
problems. Furthermore, each problem in CSP(All, All) can be solved in NP:
given A and B, we simply guess a mapping h : A → B, and check, in polynomial
time, if it is a homomorphism between A and B. Thus, CSP(All, All) is
NP-complete.
This naturally leads to the following question: under what conditions is
CSP(C, C′
) tractable?
We ﬁrst answer this question in the setting suggested by the examples
of MONOTONE 3-SAT and m-colorability. In both of these examples, we
were interested in the problem of the form CSP(All, B); that is, in the existence
of a homomorphism into a ﬁxed structure. This is a very common class
of constraint satisfaction problems. We shall write CSP(B) for CSP(All, B).
Thus, the ﬁrst question we address is when CSP(B) can be guaranteed to be
tractable.
All problems of the form CSP(B) whose complexity is known fall into
two categories: they are either tractable, or NP-complete. This is a real dichotomy:
if Ptime = NP, there are NP problems which are neither tractable
14.3 Constraint Satisfaction and Homomorphisms of Finite Models 287
nor NP-complete. In fact, it has been conjectured that for any B, the problem
CSP(B) is either tractable, or NP-complete. In general, this conjecture
remains unproven, but some partial solutions are known. For example:
Theorem 14.15. For every B with |B| ≤ 3, CSP(B) is either tractable, or
NP-complete.
Moreover, for the case of |B| = 2 (so-called Boolean constraint satisfaction
problem), one can classify precisely for which structures B the corresponding
problem CSP(B) is tractable.
For more general structures B, one can use logical deﬁnability to ﬁnd some
fairly large classes that guarantee tractability.
If one tries to think of a logic in which CSP(B) can be expressed, one
immediately thinks of MSO. Indeed, suppose that the universe of B is
{b0, . . . , bn−1}. Then the MSO sentence characterizing CSP(B) is of the form
∃X0 . . . ∃Xn−1 Ψ,
where Ψ is an FO sentence stating that, on a structure A expanded with n
sets X0, . . . , Xn−1, the sets Xi form a partition of A, and the map deﬁned by
sending all elements of Xi into bi, for i = 0, . . . , n − 1, is a homomorphism
from A to B.
However, while in many cases MSO is tractable, in general it is not suitable
to establish tractability results without putting restrictions on a class of
structures A, since MSO can express NP-complete problems.
To express CSP(B) in a tractable logic, we instead consider the negation
of CSP(B): that is,
¬CSP(B) = {A | there is no homomorphism h : A → B}.
If A ∈ ¬CSP(B) and A is a substructure of A′
, then A′
∈ ¬CSP(B).
This monotonicity property suggests that for some B, the class ¬CSP(B)
could be deﬁnable in a rather expressive tractable monotone language such as
Datalog. If this were the case, then CSP(B) would be tractable as well.
Trying to express ¬CSP(B) in Datalog may be a bit hard, but it turns
out that instead one could attempt to express ¬CSP(B) in a richer inﬁnitary
logic.
Theorem 14.16. For each B, the problem ¬CSP(B) is expressible in
Datalog iﬀ it is expressible in ∃Lω
∞ω.
Thus, one general way of achieving tractability is to show that the negation
of the constraint satisfaction problem is expressible in the existential fragment
of the very rich ﬁnite variable logic Lω
∞ω.
Moving back to the general problem CSP(C, C′
), one may ask whether
CSP(C, C′
) is tractable whenever CSP(C, B) is tractable for all B ∈ C′
. The
288 14 Other Applications of Finite Model Theory
answer to this is negative: for each ﬁxed graph G, the problem CSP({Km |
m ∈ N}, G) is tractable, but CSP({Km | m ∈ N}, All) is not. However, for the
class of structures above, a uniform version of the tractability result can be
shown.
Theorem 14.17. Let CDatalogk be the class of structures B such that
¬CSP(B) is expressible by a Datalog program that uses at most k distinct
variables. Then CSP(All, CDatalogk ) is in Ptime.
Yet another tractable restriction uses the notion of treewidth encountered
in Chap. 6. If we let TWk be the class of graphs of treewidth at most k, then
one can show that ¬CSP(TWk, B) is expressible in Datalog (in fact, in the
k-variable fragment of Datalog). Hence, CSP(TWk, B) is tractable.
In fact, this can be generalized as follows. We call two structures A and
B homomorphically equivalent if there exist homomorphisms h : A → B
and h′
: B → A. Let HTWk be the class of all structures homomorphically
equivalent to a structure in TWk.
Theorem 14.18. CSP(HTWk, All) can be expressed in LFP (in fact, using at
most 2k variables) and consequently is in Ptime.
Thus, deﬁnability results for ﬁxed point and ﬁnite variable logics describe
rather large classes of tractable constraint satisfaction problems.
14.4 Bibliographic Notes
A comprehensive survey of decidable and undecidable cases for the satisﬁability
problem is given in B¨orger, Gr¨adel, and Gurevich [25]. It describes
both the Bernays-Sch¨onﬁnkel and Ackermann classes, and proves complexity
bounds for them. The ﬁnite model property for FO2
is due to Mortimer
[184]; the complexity bound is from Gr¨adel, Kolaitis, and Vardi [100]. The
Πp
2 -completeness of containment of unions of conjunctive queries is due to
Sagiv and Yannakakis [211].
There are a number of books and surveys in which temporal and modal
logics are described in detail: van Benthem [240], Clarke, Grumberg, and Peled
[37], Emerson [64, 65], Vardi [246]. Theorem 14.10 is from Kamp [141]. Abiteboul,
Herr, and Van den Bussche [2] showed that Kamp’s theorem no longer
holds if one moves from strings to arbitrary structures. It is also known that
for the translation from LTL to FO, three variables suﬃce (i.e., over strings,
LTL equals FO3
, see, e.g., Schneider [214]), but two variables do not suﬃce
(as shown by Etessami, Vardi, and Wilke [69]). The example of expressing
a CTL property in Datalog is from Gottlob, Gr¨adel, and Veith [93], and
Theorem 14.11 is from [93] and Immerman and Vardi [136]. Equivalence of
bisimulation-invariant FO and modal logic is from van Benthem [240], and
14.4 Bibliographic Notes 289
the corresponding result for MSO and Calcµ is from Janin and Walukiewicz
[138]; for a related result about CTL∗
, see Moller and Rabinovich [183].
Constraint satisfaction is a classical AI problem (see, e.g., Tsang [235]).
The idea of viewing constraint satisfaction as the existence of a homomorphism
between two structures is due to Feder and Vardi [77]. They also suggested
using expressibility in Datalog as a tool for proving tractability, and formulated
the dichotomy conjecture. Theorem 14.15 is due to Schaefer [213] (for
|B| = 2) and Bulatov [28] (for |B| = 3). The existence of complexity classes
between Ptime and NP-complete, mentioned before Theorem 14.15, is due to
Ladner [159]. Other results in that section are from Kolaitis and Vardi [156]
and Dalmau, Kolaitis, and Vardi [48]. The converse of Theorem 14.18 was
proved recently by Grohe [112].
References
1. S. Abiteboul, K. Compton, and V. Vianu. Queries are easier than you thought
(probably). In ACM Symp. on Principles of Database Systems, 1992, ACM
Press, pages 23–32.
2. S. Abiteboul, L. Herr, and J. Van den Bussche. Temporal connectives versus
explicit timestamps to query temporal databases. Journal of Computer and
System Sciences, 58 (1999), 54–68.
3. S. Abiteboul, R. Hull, and V. Vianu. Foundations of Databases, AddisonWesley,
1995.
4. S. Abiteboul, M. Y. Vardi, and V. Vianu. Fixpoint logics, relational machines,
and computational complexity. Journal of the ACM, 44 (1997), 30–56.
5. S. Abiteboul and V. Vianu. Fixpoint extensions of ﬁrst-order logic and dataloglike
languages. In Proc. IEEE Symp. on Logic in Computer Science, 1989, pages
71–79.
6. S. Abiteboul and V. Vianu. Computing with ﬁrst-order logic. Journal of
Computer and System Sciences, 50 (1995), 309–335.
7. J.W. Addison, L. Henkin, A. Tarski, eds. The Theory of Models. NorthHolland,
1965.
8. F. Afrati, S. Cosmadakis, and M. Yannakakis. On datalog vs. polynomial time.
Journal of Computer and System Sciences, 51 (1995), 177–196.
9. A. Aho and J. Ullman. The universality of data retrieval languages. In
Proc. ACM Symp. on Principles of Programming Languages, 1979, ACM Press,
pages 110–120.
10. M. Ajtai. Σ1
1 formulae on ﬁnite structures. Annals of Pure and Applied Logic,
24 (1983), 1–48.
11. M. Ajtai and R. Fagin. Reachability is harder for directed than for undirected
graphs. Journal of Symbolic Logic, 55 (1990), 113–150.
12. M. Ajtai, R. Fagin, and L. Stockmeyer. The closure of monadic NP. Journal
of Computer and System Sciences, 60 (2000), 660–716.
13. M. Ajtai and Y. Gurevich. Monotone versus positive. Journal of the ACM, 34
(1987), 1004–1015.
292 References
14. G. Asser. Das Repr¨asentantenproblem im Pr¨adikatenkalk¨ul der Ersten Stufe
mit Identit¨at. Zeitschrift f¨ur Mathematische Logik und Grundlagen der Mathematik,
1 (1955), 252–263.
15. D.A.M. Barrington, N. Immerman, C. Lautemann, N. Schweikardt, and
D. Th´erien. The Crane Beach conjecture. In IEEE Symp. on Logic in Computer
Science, 2001, pages 187–196.
16. D.A.M. Barrington, N. Immerman, and H. Straubing. On uniformity within
NC1
. Journal of Computer and System Sciences, 41 (1990), 274–306.
17. J. Barwise. On Moschovakis closure ordinals. Journal of Symbolic Logic, 42
(1977), 292–296.
18. J. Barwise and S. Feferman, eds. Model-Theoretic Logics. Springer-Verlag,
1985.
19. M. Benedikt and L. Libkin. Relational queries over interpreted structures.
Journal of the ACM, 47 (2000), 644–680.
20. M. Benedikt and L. Libkin. Tree extension algebras: logics, automata, and
query languages. In IEEE Symp. on Logic in Computer Science, 2002, pages
203–212.
21. M. Benedikt, L. Libkin, T. Schwentick, and L. Segouﬁn. Deﬁnable relations
and ﬁrst-order query languages over strings. Journal of the ACM, 50 (2003),
694–751.
22. A. Blass, Y. Gurevich, and D. Kozen. A zero-one law for logic with a ﬁxed-point
operator. Information and Control, 67 (1985), 70–90.
23. A. Blumensath and E. Gr¨adel. Automatic structures. In IEEE Symp. on Logic
in Computer Science, 2000, pages 51–62.
24. H. Bodlaender. A linear-time algorithm for ﬁnding tree-decompositions of small
treewidth. SIAM Journal on Computing, 25 (1996), 1305–1317.
25. E. B¨orger, E. Gr¨adel, and Y. Gurevich. The Classical Decision Problem.
Springer-Verlag, 1997.
26. V. Bruy`ere, G. Hansel, C. Michaux, and R. Villemaire. Logic and precognizable
sets of integers. Bulletin of the Belgian Mathematical Society,
1 (1994), 191–238.
27. J.R. B¨uchi. Weak second-order arithmetic and ﬁnite automata. Zeitschrift f¨ur
Mathematische Logik und Grundlagen der Mathematik, 6 (1960), 66–92.
28. A. Bulatov. A dichotomy theorem for constraints on a three-element set. IEEE
Symp. on Foundations of Computer Science, 2002, pages 649–658.
29. S. R. Buss. First-order proof theory of arithmetic. In Handbook of Proof
Theory, Elsevier, Amsterdam, 1998, pages 79–147.
30. J. Cai, M. F¨urer, and N. Immerman. On optimal lower bound on the number
of variables for graph identiﬁcation. Combinatorica, 12 (1992), 389–410.
31. P.J. Cameron. The random graph revisited. In Eur. Congr. of Mathematics,
Vol. 1, Progress in Mathematics, Birkh¨auser, 2001, pages 267–274.
32. A. Chandra and D. Harel. Computable queries for relational databases. Journal
of Computer and System Sciences, 21 (1980), 156–178.
33. A. Chandra and D. Harel. Structure and complexity of relational queries.
Journal of Computer and System Sciences, 25 (1982), 99–128.
References 293
34. A. Chandra and P. Merlin. Optimal implementation of conjunctive queries in
relational data bases. In ACM Symp. on Theory of Computing, 1977, pages
77–90.
35. C.C. Chang and H.J. Keisler. Model Theory. North-Holland, 1990.
36. O. Chapuis and P. Koiran. Deﬁnability of geometric properties in algebraically
closed ﬁelds. Mathematical Logic Quarterly, 45 (1999), 533–550.
37. E. Clarke, O. Grumberg, and D. Peled. Model Checking. The MIT Press, 1999.
38. H. Comon, M. Dauchet, R. Gilleron, F. Jacquemard, D. Lugiez, S. Tison,
and M. Tommasi. Tree Automata: Techniques and Applications. Available at
www.grappa.univ-lille3.fr/tata. October 2002.
39. S.A. Cook. The complexity of theorem-proving procedures. In Proc. ACM
Symp. on Theory of Computing, 1971, ACM Press, pages 151–158.
40. S.A. Cook. Proof complexity and bounded arithmetic. Manuscript, Univ. of
Toronto, 2002.
41. S.A. Cook and Y. Liu. A complete axiomatization for blocks world. In Proc. 7th
Int. Symp. on Artiﬁcial Intelligence and Mathematics, January, 2002.
42. S. Cosmadakis. Logical reducibility and monadic NP. In Proc. IEEE Symp.
on Foundations of Computer Science, 1993, pages 52–61.
43. S. Cosmadakis, H. Gaifman, P. Kanellakis, and M. Vardi. Decidable optimization
problems for database logic programs. In ACM Symp. on Theory of
Computing, 1988, pages 477–490.
44. B. Courcelle. Graph rewriting: an algebraic and logic approach. In Handbook
of Theoretical Computer Science, Vol. B, North-Holland, 1990, pages 193–242.
45. B. Courcelle. On the expression of graph properties in some fragments of
monadic second-order logic. In [134], pages 33–62.
46. B. Courcelle. The monadic second-order logic on graphs VI: on several representations
of graphs by relational structures. Discrete Applied Mathematics,
54 (1994), 117–149.
47. B. Courcelle and J. Makowsky. Fusion in relational structures and the veriﬁcation
of monadic second-order properties. Mathematical Structures in Computer
Science, 12 (2002), 203–235.
48. V. Dalmau, Ph. Kolaitis, and M. Vardi. Constraint satisfaction, bounded
treewidth, and ﬁnite-variable logics. Proc. Principles and Practice of Constraint
Programming, Springer-Verlag LNCS 2470, 2002, pages 310–326.
49. A. Dawar. A restricted second order logic for ﬁnite structures. Logic and
Computational Complexity, Springer-Verlag, LNCS 960, 1994, pages 393–413.
50. A. Dawar, K. Doets, S. Lindell, and S. Weinstein. Elementary properties of
ﬁnite ranks. Mathematical Logic Quarterly, 44 (1998), 349–353.
51. A. Dawar and Y. Gurevich. Fixed point logics. Bulletin of Symbolic Logic, 8
(2002), 65-88.
52. A. Dawar and L. Hella. The expressive power of ﬁnitely many generalized
quantiﬁers. Information and Computation, 123 (1995), 172–184.
53. A. Dawar, S. Lindell, and S. Weinstein. Inﬁnitary logic and inductive deﬁnability
over ﬁnite structures. Information and Computation, 119 (1995), 160–175.
294 References
54. A. Dawar, S. Lindell, and S. Weinstein. First order logic, ﬁxed point logic,
and linear order. In Computer Science Logic, Springer-Verlag LNCS Vol. 1092,
1995, pages 161–177.
55. L. Denenberg, Y. Gurevich, and S. Shelah. Deﬁnability by constant-depth
polynomial-size circuits. Information and Control, 70 (1986), 216–240.
56. M. de Rougemont. Second-order and inductive deﬁnability on ﬁnite structures.
Zeitschrift f¨ur Mathematische Logik und Grundlagen der Mathematik,
33 (1987), 47–63.
57. G. Dong, L. Libkin, and L. Wong. Local properties of query languages. Theoretical
Computer Science, 239 (2000), 277–308.
58. R. Downey and M. Fellows. Parameterized Complexity. Springer-Verlag, 1999.
59. D.-Z. Du, K.-I. Ko. Theory of Computational Complexity. Wiley-Interscience,
2000.
60. H.-D. Ebbinghaus and J. Flum. Finite Model Theory. Springer-Verlag, 1995.
61. H.-D. Ebbinghaus, J. Flum, and W. Thomas. Mathematical Logic. SpringerVerlag,
1984.
62. A. Ehrenfeucht. An application of games to the completeness problem for
formalized theories. Fundamenta Mathematicae, 49 (1961), 129–141.
63. T. Eiter, G. Gottlob, and Y. Gurevich. Existential second-order logic over
strings. Journal of the ACM, 47 (2000), 77–131.
64. E.A. Emerson. Temporal and modal logic. In Handbook of Theoretical Computer
Science, Vol. B, North-Holland, 1990, pages 995–1072.
65. E.A. Emerson. Model checking and the mu-calculus. In [134], pages 185–214.
66. H. Enderton. A Mathematical Introduction to Logic. Academic-Press, 1972.
67. P. Erd¨os and A. R´enyi. Asymmetric graphs. Acta Mathematicae Academiae
Scientiarum Hungaricae, 14 (1963), 295–315.
68. K. Etessami. Counting quantiﬁers, successor relations, and logarithmic space.
Journal of Computer and System Sciences, 54 (1997), 400–411.
69. K. Etessami, M.Y. Vardi, and T. Wilke. First-order logic with two variables
and unary temporal logic. Information and Computation, 179 (2002), 279–295.
70. R. Fagin. Generalized ﬁrst-order spectra and polynomial-time recognizable
sets. In Complexity of Computation, R. Karp, ed., SIAM-AMS Proceedings, 7
(1974), 43–73.
71. R. Fagin. Monadic generalized spectra. Zeitschrift f¨ur Mathematische Logik
und Grundlagen der Mathematik, 21 (1975), 89–96.
72. R. Fagin. A spectrum hierarchy. Zeitschrift f¨ur Mathematische Logik und
Grundlagen der Mathematik, 21 (1975), 123–134.
73. R. Fagin. Probabilities on ﬁnite models. Journal of Symbolic Logic, 41 (1976),
50–58.
74. R. Fagin. Finite-model theory — a personal perspective. Theoretical Computer
Science, 116 (1993), 3–31.
75. R. Fagin. Easier ways to win logical games. In [134], pages 1–32.
76. R. Fagin, L. Stockmeyer, and M.Y. Vardi. On monadic NP vs monadic co-NP.
Information and Computation, 120 (1994), 78–92.
References 295
77. T. Feder and M.Y. Vardi. The computational structure of monotone monadic
SNP and constraint satisfaction: a study through datalog and group theory.
SIAM Journal on Computing, 28 (1998), 57–104.
78. T. Feder and M.Y. Vardi. Homomorphism closed vs. existential positive. IEEE
Symp. on Logic in Computer Science, 2003, pages 311–320.
79. S. Feferman and R. Vaught. The ﬁrst order properties of products of algebraic
systems. Fundamenta Mathematicae, 47 (1959), 57–103.
80. J. Flum, M. Frick, and M. Grohe. Query evaluation via tree-decompositions.
Journal of the ACM, 49 (2002), 716–752.
81. J. Flum and M. Grohe. Fixed-parameter tractability, deﬁnability, and modelchecking.
SIAM Journal on Computing 31 (2001), 113–145.
82. J. Flum and M. Ziegler. Pseudo-ﬁnite homogeneity and saturation. Journal of
Symbolic Logic, 64 (1999), 1689–1699.
83. H. Fournier. Quantiﬁer rank for parity of embedded ﬁnite models. Theoretical
Computer Science, 295 (2003), 153–169.
84. R. Fra¨ıss´e. Sur quelques classiﬁcations des syst`emes de relations. Universit´e
d’Alger, Publications Scientiﬁques, S´erie A, 1 (1954), 35–182.
85. M. Frick and M. Grohe. The complexity of ﬁrst-order and monadic secondorder
logic revisited. In IEEE Symp. on Logic in Computer Science, 2002,
pages 215–224.
86. M. Furst, J. Saxe, and M. Sipser. Parity, circuits, and the polynomial-time
hierarchy. Mathematical Systems Theory, 17 (1984), 13–27.
87. H. Gaifman. Concerning measures in ﬁrst-order calculi. Israel Journal of
Mathematics, 2 (1964), 1–17.
88. H. Gaifman. On local and non-local properties, Proc. Herbrand Symp., Logic
Colloquium ’81, North-Holland, 1982.
89. H. Gaifman and M.Y. Vardi. A simple proof that connectivity is not ﬁrst-order
deﬁnable. Bulletin of the EATCS, 26 (1985), 43–45.
90. F. G´ecseg and M. Steinby. Tree languages. In Handbook of Formal Languages,
Vol. 3. Springer-Verlag, 1997, pages 1–68.
91. F. Gire and H. K. Hoang. A more expressive deterministic query language with
eﬃcient symmetry-based choice construct. In Logic in Databases, Int. Workshop
LID’96, Springer-Verlag, 1996, pages 475–495.
92. Y.V. Glebskii, D.I. Kogan, M.A. Liogon’kii, and V.A. Talanov Range and degree
of realizability of formulas in predicate calculus (in Russian). Kibernetika,2
(1969), 17–28.
93. G. Gottlob, E. Gr¨adel, and H. Veith. Datalog LITE: a deductive query language
with linear time model checking. ACM Transactions on Computational Logic,
3 (2002), 42–79.
94. G. Gottlob and C. Koch. Monadic datalog and the expressive power of languages
for Web information extraction. Journal of the ACM, 51 (2004), 74–113.
95. G. Gottlob, Ph. Kolaitis, and T. Schwentick. Existential second-order logic
over graphs: charting the tractability frontier. In IEEE Symp. on Foundations
of Computer Science, 2000, pages 664–674.
296 References
96. G. Gottlob, N. Leone, and F. Scarcello. The complexity of acyclic conjunctive
queries. Journal of the ACM, 48 (2001), 431–498.
97. E. Gr¨adel. Capturing complexity classes by fragments of second order logic.
Theoretical Computer Science, 101 (1992), 35–57.
98. E. Gr¨adel and Y. Gurevich. Metaﬁnite model theory. Information and Computation,
140 (1998), 26–81.
99. E. Gr¨adel, Ph. Kolaitis, L. Libkin, M. Marx, J. Spencer, M.Y. Vardi, Y. Venema,
S. Weinstein. Finite Model Theory and its Applications. Springer-Verlag,
2004.
100. E. Gr¨adel, Ph. Kolaitis, and M.Y. Vardi. On the decision problem for twovariable
ﬁrst-order logic. Bulletin of Symbolic Logic, 3 (1997), 53–69.
101. E. Gr¨adel and G. McColm. On the power of deterministic transitive closures.
Information and Computation, 119 (1995), 129–135.
102. E. Gr¨adel and M. Otto. Inductive deﬁnability with counting on ﬁnite structures.
Proc. Computer Science Logic, 1992, Springer-Verlag, pages 231–247.
103. R.L. Graham, B.L. Rothschild and J.H. Spencer. Ramsey Theory. John Wiley
& Sons, 1990.
104. E. Grandjean. Complexity of the ﬁrst-order theory of almost all ﬁnite structures.
Information and Control, 57 (1983), 180–204.
105. E. Grandjean and F. Olive. Monadic logical deﬁnability of nondeterministic
linear time. Computational Complexity, 7 (1998), 54–97.
106. M. Grohe. The structure of ﬁxed-point logics. PhD Thesis, University of
Freiburg, 1994.
107. M. Grohe. Fixed-point logics on planar graphs. In IEEE Symp. on Logic in
Computer Science, 1998, pages 6–15.
108. M. Grohe. Equivalence in ﬁnite-variable logics is complete for polynomial time.
Combinatorica, 19 (1999), 507–532.
109. M. Grohe. The parameterized complexity of database queries. In ACM Symp.
on Principles of Database Systems, 2001, ACM Press, pages 82–92.
110. M. Grohe. Large ﬁnite structures with few Lk
-types. Information and Computation,
179 (2002), 250–278.
111. M. Grohe. Parameterized complexity for the database theorist. SIGMOD
Record, 31 (2002), 86–96.
112. M. Grohe. The complexity of homomorphism and constraint satisfaction problems
seen from the other side. In IEEE Symp. on Foundations of Computer
Science, 2003, pages 552–561.
113. M. Grohe and T. Schwentick. Locality of order-invariant ﬁrst-order formulas.
ACM Transactions on Computational Logic, 1 (2000), 112–130.
114. M. Grohe, T. Schwentick, and L. Segouﬁn. When is the evaluation of conjunctive
queries tractable? In ACM Symp. on Theory of Computing, 2001, pages
657–666.
115. S. Grumbach and J. Su. Queries with arithmetical constraints. Theoretical
Computer Science, 173 (1997), 151–181.
116. Y. Gurevich. Toward logic tailored for computational complexity. In Computation
and Proof Theory, M. Richter et al., eds., Springer Lecture Notes in
References 297
Mathematics, Vol. 1104, 1984, pages 175–216.
117. Y. Gurevich. Logic and the challenge of computer science. In Current trends
in theoretical computer science, E. B¨orger, ed., Computer Science Press, 1988,
pages 1–57.
118. Y. Gurevich, N. Immerman, and S. Shelah. McColm’s conjecture. In IEEE
Symp. on Logic in Computer Science, 1994, 10–19.
119. Y. Gurevich and S. Shelah. Fixed-point extensions of ﬁrst-order logic. Annals
of Pure and Applied Logic, 32 (1986), 265–280.
120. W. Hanf. Model-theoretic methods in the study of elementary logic. In [7],
pages 132–145.
121. L. Hella. Logical hierarchies in PTIME. Information and Computation, 129
(1996), 1–19.
122. L. Hella, Ph. Kolaitis, and K. Luosto. Almost everywhere equivalence of logics
in ﬁnite model theory. Bulletin of Symbolic Logic, 2 (1996), 422–443.
123. L. Hella, L. Libkin, and J. Nurmonen. Notions of locality and their logical
characterizations over ﬁnite models. Journal of Symbolic Logic, 64 (1999),
1751–1773.
124. L. Hella, L. Libkin, J. Nurmonen, and L. Wong. Logics with aggregate operators.
Journal of the ACM, 48 (2001), 880–907.
125. W. Hodges. Model Theory. Cambridge University Press, 1993.
126. J. Hopcroft and J. Ullman. Introduction to Automata Theory, Languages, and
Computation. Addison-Wesley, 1979.
127. R. Hull and J. Su. Domain independence and the relational calculus. Acta
Informatica, 31 (1994), 513–524.
128. N. Immerman. Upper and lower bounds for ﬁrst order expressibility. Journal
of Computer and System Sciences, 25 (1982), 76–98.
129. N. Immerman. Relational queries computable in polynomial time (extended
abstract). In ACM Symp. on Theory of Computing, 1982, ACM Press, pages
147–152.
130. N. Immerman. Relational queries computable in polynomial time. Information
and Control, 68 (1986), 86–104.
131. N. Immerman. Languages that capture complexity classes. SIAM Journal on
Computing, 16 (1987), 760–778.
132. N. Immerman. Nondeterministic space is closed under complementation. SIAM
Journal on Computing, 17 (1988), 935–938.
133. N. Immerman. Descriptive Complexity. Springer-Verlag, 1998.
134. N. Immerman and Ph. Kolaitis, eds. Descriptive Complexity and Finite Models,
Proc. of a DIMACS workshop. AMS, 1997.
135. N. Immerman and E. Lander. Describing graphs: a ﬁrst order approach
to graph canonization. In Complexity Theory Retrospective, Springer-Verlag,
Berlin, 1990.
136. N. Immerman and M.Y. Vardi. Model checking and transitive-closure logic. In
Proc. Int. Conf. on Computer Aided Veriﬁcation, Springer-Verlag LNCS 1254,
1997, pages 291–302.
298 References
137. D. Janin and J. Marcinkowski. A toolkit for ﬁrst order extensions of monadic
games. Proc. of Symp. on Theoretical Aspects of Computer Science, SpringerVerlag
LNCS vol. 2010, Springer Verlag, 2001, 353–364.
138. D. Janin and I. Walukiewicz. On the expressive completeness of the propositional
mu-calculus with respect to monadic second order logic. In Proc. of
CONCUR’96, Springer-Verlag LNCS 1119, 1996, pages 263–277.
139. D.S. Johnson. A catalog of complexity classes. In Handbook of Theoretical
Computer Science, Vol. A, North-Holland, 1990, pages 67–161.
140. N. Jones and A. Selman. Turing machines and the spectra of ﬁrst-order formulas.
Journal of Symbolic Logic, 39 (1974), 139–150.
141. H. Kamp. Tense logic and the theory of linear order. PhD Thesis, University
of California, Los Angeles, 1968.
142. P. Kanellakis, G. Kuper, and P. Revesz. Constraint query languages. Journal
of Computer and System Sciences, 51 (1995), 26–52.
143. C. Karp. Finite quantiﬁer equivalence. In [7], pages 407–412.
144. M. Kaufmann and S. Shelah. On random models of ﬁnite power and monadic
logic. Discrete Mathematics, 54 (1985), 285–293.
145. B. Khoussainov and A. Nerode. Automata Theory and its Applications.
Birkh¨auser, 2001.
146. S. Kleene. Arithmetical predicates and function quantiﬁers. Transactions of
the American Mathematical Society, 79 (1955), 312–340.
147. Ph. Kolaitis. Languages for polynomial-time queries – an ongoing quest. In
Proc. 5th Int. Conf. on Database Theory, Springer-Verlag, 1995, pages 38–39.
148. Ph. Kolaitis. On the expressive power of logics on ﬁnite models. In [99].
149. Ph. Kolaitis and J. V¨a¨an¨anen. Generalized quantiﬁers and pebble games on
ﬁnite structures. Annals of Pure and Applied Logic, 74 (1995), 23–75.
150. Ph. Kolaitis and M.Y. Vardi. The decision problem for the probabilities of
higher-order properties. In ACM Symp. on Theory of Computing, 1987, pages
425–435.
151. Ph. Kolaitis and M.Y. Vardi. 0-1 laws and decision problems for fragments of
second-order logic. Information and Computation, 87 (1990), 301–337.
152. Ph. Kolaitis and M.Y. Vardi. Inﬁnitary logic and 0-1 laws. Information and
Computation, 98 (1992), 258–294.
153. Ph. Kolaitis and M.Y. Vardi. Fixpoint logic vs. inﬁnitary logic in ﬁnite-model
theory. In IEEE Symp. on Logic in Computer Science, 1992, pages 46–57.
154. Ph. Kolaitis and M.Y. Vardi. On the expressive power of Datalog: tools and a
case study. Journal of Computer and System Sciences, 51 (1995), 110–134.
155. Ph. Kolaitis and M.Y. Vardi. 0-1 laws for fragments of existential secondorder
logic: a survey. In Proc. Mathematical Foundations of Computer Science,
Springer-Verlag LNCS 1893, 2000, pages 84–98.
156. Ph. Kolaitis and M.Y. Vardi. Conjunctive-query containment and constraint
satisfaction. Journal of Computer and System Sciences, 61 (2000), 302–332.
157. B. Kuijpers, J. Paredaens, and J. Van den Bussche. Topological elementary
equivalence of closed semi-algebraic sets in the real plane. Journal of Symbolic
Logic, 65 (2000), 1530–1555.
References 299
158. G. Kuper, L. Libkin, and J. Paredaens, eds. Constraint Databases. SpringerVerlag,
2000.
159. R.E. Ladner. On the structure of polynomial time reducibility. Journal of the
ACM, 22 (1975), 155–171.
160. R.E. Ladner. Application of model theoretic games to discrete linear orders
and ﬁnite automata. Information and Control, 33 (1977), 281–303.
161. C. Lautemann, N. Schweikardt, and T. Schwentick. A logical characterisation of
linear time on nondeterministic Turing machines. In Proc. Symp. on Theoretical
Aspects of Computer Science, Springer-Verlag LNCS 1563, 1999, pages 143–
152.
162. C. Lautemann, T. Schwentick, and D. Th´erien. Logics for context-free languages.
In Proc. Computer Science Logic 1994, Springer-Verlag, 1995, pages
205–216.
163. J.-M. Le Bars. Fragments of existential second-order logic without 0-1 laws.
In IEEE Symp. on Logic in Computer Science, 1998, pages 525–536.
164. J.-M. Le Bars. The 0-1 law fails for monadic existential second-order logic on
undirected graphs. Information Processing Letters, 77 (2001), 43–48.
165. D. Leivant. Inductive deﬁnitions over ﬁnite structures. Information and Computation
89 (1990), 95–108.
166. L. Libkin. On counting logics and local properties. ACM Transactions on
Computational Logic, 1 (2000), 33–59.
167. L. Libkin. Logics capturing local properties. ACM Transactions on Computational
Logic, 2 (2001), 135–153.
168. L. Libkin. Embedded ﬁnite models and constraint databases. In [99].
169. L. Libkin and L. Wong. Query languages for bags and aggregate functions.
Journal of Computer and System Sciences, 55 (1997), 241–272.
170. L. Libkin and L. Wong. Lower bounds for invariant queries in logics with
counting. Theoretical Computer Science, 288 (2002), 153–180.
171. S. Lindell. An analysis of ﬁxed-point queries on binary trees. Theoretical
Computer Science, 85 (1991), 75–95.
172. A.B. Livchak. Languages for polynomial-time queries (in Russian). In
Computer-based Modeling and Optimization of Heat-power and Electrochemical
Objects Sverdlovsk, 1982, page 41.
173. J. Lynch. Almost sure theories. Annals of Mathematical Logic, 18 (1980),
91–135.
174. J. Lynch. Complexity classes and theories of ﬁnite models. Mathematical
Systems Theory, 15 (1982), 127–144.
175. R.C. Lyndon. An interpolation theorem in the predicate calculus. Paciﬁc
Journal of Mathematics, 9 (1959), 155–164.
176. J. Makowsky. Model theory and computer science: an appetizer. In Handbook
of Logic in Computer Science, Vol. 1, Oxford University Press, 1992.
177. J. Makowsky. Algorithmic aspects of the Feferman-Vaught Theorem. Annals
of Pure and Applied Logic, 126 (2004), 159–213.
178. J. Makowsky and Y. Pnueli. Arity and alternation in second-order logic. Annals
of Pure and Applied Logic, 78 (1996), 189–202.
300 References
179. J. Marcinkowski. Achilles, turtle, and undecidable boundedness problems for
small datalog programs. SIAM Journal on Computing, 29 (1999), 231–257.
180. O. Matz, N. Schweikardt, and W. Thomas. The monadic quantiﬁer alternation
hierarchy over grids and graphs. Information and Computation, 179 (2002),
356–383.
181. G.L. McColm. When is arithmetic possible? Annals of Pure and Applied Logic,
50 (1990), 29–51.
182. R. McNaughton and S. Papert. Counter-Free Automata. MIT Press, 1971.
183. F. Moller and A. Rabinovich. On the expressive power of CTL. In IEEE Symp.
on Logic in Computer Science, 1999, pages 360-369.
184. M. Mortimer. On language with two variables. Zeitschrift f¨ur Mathematische
Logik und Grundlagen der Mathematik, 21 (1975), 135–140.
185. Y. Moschovakis. Elementary Induction on Abstract Structures. North-Holland,
1974.
186. F. Neven. Automata theory for XML researchers. SIGMOD Record, 31 (2002),
39–46.
187. F. Neven and T. Schwentick. Query automata on ﬁnite trees. Theoretical
Computer Science, 275 (2002), 633–674.
188. J. Nurmonen. On winning strategies with unary quantiﬁers. Journal of Logic
and Computation, 6 (1996), 779–798.
189. J. Nurmonen. Counting modulo quantiﬁers on ﬁnite structures. Information
and Computation, 160 (2000), 62–87.
190. M. Otto. A note on the number of monadic quantiﬁers in monadic Σ1
1 . Information
Processing Letters, 53 (1995), 337–339.
191. M. Otto. Bounded Variable Logics and Counting: A Study in Finite Models.
Springer-Verlag, 1997.
192. M. Otto. Epsilon-logic is more expressive than ﬁrst-order logic over ﬁnite
structures. Journal of Symbolic Logic, 65 (2000), 1749–1757.
193. M. Otto and J. Van den Bussche. First-order queries on databases embedded
in an inﬁnite structure. Information Processing Letters, 60 (1996), 37–41.
194. C. Papadimitriou. A note on the expressive power of Prolog. Bulletin of the
EATCS, 26 (1985), 21–23.
195. C. Papadimitriou. Computational Complexity. Addison-Wesley, 1994.
196. C. Papadimitriou and M. Yannakakis. On the complexity of database queries.
Journal of Computer and System Sciences, 58 (1999), 407–427.
197. J. Paredaens, J. Van den Bussche, and D. Van Gucht. First-order queries
on ﬁnite structures over the reals. SIAM Journal on Computing, 27 (1998),
1747–1763.
198. A. Pillay and C. Steinhorn. Deﬁnable sets in ordered structures. III. Transactions
of the American Mathematical Society, 309 (1988), 469–476.
199. E. Pezzoli. Computational complexity of Ehrenfeucht-Fra¨ıss´e games on ﬁnite
structures. Computer Science Logic 1998, Springer-Verlag, LNCS 1584, pages
159–170.
200. B. Poizat. Deux ou trois choses que je sais de Ln. Journal of Symbolic Logic,
47 (1982), 641–658.
References 301
201. B. Poizat. A Course in Model Theory: An Introduction to Contemporary Mathematical
Logic. Springer-Verlag, 2000.
202. M. Rabin. Decidability of second-order theories and automata on inﬁnite trees.
Transactions of the American Mathematical Society, 141 (1969), 1–35.
203. R. Rado. Universal graphs and universal functions. Acta Arithmetica, 9 (1964),
331–340.
204. N. Robertson and P. Seymour. Graph minors V. Excluding a planar graph.
Journal of Combinatorial Theory, Series B, 41 (1986), 92–114.
205. N. Robertson and P. Seymour. Graph minors XIII. The disjoint paths problem.
Journal of Combinatorial Theory, Series B, 63 (1995), 65–110.
206. J. Robinson. Deﬁnability and decision problems in arithmetic. Journal of
Symbolic Logic, 14 (1949), 98–114.
207. E. Rosen. Some aspects of model theory and ﬁnite structures. Bulletin of
Symbolic Logic, 8 (2002), 380–403.
208. E. Rosen and S. Weinstein. Preservation theorems in ﬁnite model theory. In
Logic and Computational Complexity, Springer-Verlag LNCS 960, 1994, pages
480–502.
209. J. Rosenstein. Linear Orderings. Academic Press, 1982.
210. B. Rossman. Successor-invariance in the ﬁnite. In IEEE Symp. on Logic in
Computer Science, 2003, pages 148–157.
211. Y. Sagiv and M. Yannakakis. Equivalences among relational expressions with
the union and diﬀerence operators. Journal of the ACM, 27 (1980), 633–655.
212. V. Sazonov. Polynomial computability and recursivity in ﬁnite domains. Elektronische
Informationsverarbeitung und Kybernetik, 16 (1980), 319–323.
213. T. Schaefer. The complexity of satisﬁability problems. In Proc. 10th Symp. on
Theory of Computing, 1978, pages 216–226.
214. K. Schneider. Veriﬁcation of Reactive Systems. Springer-Verlag, 2004.
215. T. Schwentick. On winning Ehrenfeucht games and monadic NP. Annals of
Pure and Applied Logic, 79 (1996), 61–92.
216. T. Schwentick. Descriptive complexity, lower bounds and linear time. In Proc.
of Computer Science Logic, Springer-Verlag LNCS 1584, 1998, pages 9–28.
217. T. Schwentick and K. Barthelmann. Local normal forms for ﬁrst-order logic
with applications to games and automata. In Proc. 15th Symp. on Theoretical
Aspects of Computer Science (STACS’98), Springer-Verlag, 1998, pages 444–
454.
218. D. Seese. The structure of models of decidable monadic theories of graphs.
Annals of Pure and Applied Logic, 53 (1991), 169–195.
219. D. Seese. Linear time computable problems and ﬁrst-order descriptions. Mathematical
Structures in Computer Science, 6 (1996), 505–526.
220. O. Shmueli. Decidability and expressiveness of logic queries. In ACM Symp.
on Principles of Database Systems, 1987, ACM Press, pages 237–249.
221. M. Sipser. Introduction to the Theory of Computation. PWS Publishing, 1997.
222. L. Stockmeyer. The complexity of decision problems in automata and logic.
PhD Thesis, MIT, 1974.
302 References
223. L. Stockmeyer. The polynomial-time hierarchy. Theoretical Computer Science,
3 (1977), 1–22.
224. L. Stockmeyer and A. Meyer. Cosmological lower bound on the circuit complexity
of a small problem in logic. Journal of the ACM, 49 (2002), 753–784.
225. H. Straubing. Finite Automata, Formal Logic, and Circuit Complexity.
Birkh¨auser, 1994.
226. R. Szelepcs´enyi. The method of forced enumeration for nondeterministic automata.
Acta Informatica, 26 (1988), 279–284.
227. V.A. Talanov and V.V. Knyazev. The asymptotic truth value of inﬁnite formulas
(in Russian), Proc. All-Union seminar on discrete mathematics and its
applications, Moscow State University, Faculty of Mathematics and Mechanics,
1986, pages 56–61.
228. R. Tarjan and M. Yannakakis. Simple linear-time algorithms to test chordality
of graphs, test acyclicity of hypergraphs, and selectively reduce acyclic hypergraphs.
SIAM Journal on Computing, 13 (1984), 566–579.
229. A. Tarski. A Decision Method for Elementary Algebra and Geometry. Univ.
of California Press, 1951. Reprinted in Quantiﬁer Elimination and Cylindrical
Algebraic Decomposition, B. Caviness and J. Johnson, eds. Springer-Verlag,
1998, pages 24–84.
230. J. Thatcher and J. Wright. Generalized ﬁnite automata theory with an application
to a decision problem of second-order logic. Mathematical Systems
Theory, 2 (1968), 57–81.
231. W. Thomas. Classifying regular events in symbolic logic. Journal of Computer
and System Sciences, 25 (1982), 360–376.
232. W. Thomas. Logical aspects in the study of tree languages. In Proc. 9th
Int. Colloq. on Trees in Algebra and Programming (CAAP’84), Cambridge
University Press, 1984, pages 31–50.
233. W. Thomas. Languages, automata, and logic. In Handbook of Formal Languages,
Vol. 3, Springer-Verlag, 1997, pages 389–455.
234. B. A. Trakhtenbrot.The impossibilty of an algorithm for the decision problem
for ﬁnite models (in Russian), Doklady Academii Nauk SSSR, 70 (1950), 569–
572.
235. E. Tsang. Foundations of Constraint Satisfaction. Academic Press, 1993.
236. G. Tur´an. On the deﬁnability of properties of ﬁnite graphs. Discrete Mathematics,
49 (1984), 291–302.
237. J. V¨a¨an¨anen. Generalized quantiﬁers. Bulletin of the EATCS, 62 (1997), 115–
136.
238. J. V¨a¨an¨anen. Unary quantiﬁers on ﬁnite models. Journal of Logic, Language
and Information, 6 (1997), 275–304.
239. J. V¨a¨an¨anen. A Short Course in Finite Model Theory. University of Helsinki.
44pp. Available at www.math.helsinki.fi/logic/people/jouko.vaananen.
240. J. van Benthem. Modal Logic and Classical Logic. Bibliopolis, 1983.
241. D. Van Dalen. Logic and Structure. Springer-Verlag, 1994.
242. J. Van den Bussche. Constraint databases: a tutorial introduction. SIGMOD
Record, 29 (2000), 44–51.
References 303
243. L. van den Dries. Tame Topology and O-Minimal Structures. Cambridge University
Press, 1998.
244. M.Y. Vardi. The complexity of relational query languages. In Proc. ACM
Symp. on Theory of Computing, 1982, 137–146.
245. M.Y. Vardi. On the complexity of bounded-variable queries. In ACM Symp.
on Principles of Database Systems, ACM Press, 1995, pages 266–276.
246. M.Y. Vardi. Why is modal logic so robustly decidable? In [134], pages 149–183.
247. H. Vollmer. Introduction to Circuit Complexity. Springer-Verlag, 1999.
248. A.J. Wilkie. Model completeness results for expansions of the ordered ﬁeld
of real numbers by restricted Pfaﬃan functions and the exponential function.
Journal of the American Mathematical Society, 9 (1996), 1051–1094.
249. M. Yannakakis. Algorithms for acyclic database schemes. In Proc. Conf. on
Very Large Databases, 1981, pages 82–94.
250. M. Yannakakis. Perspectives on database theory. In IEEE Symp. on Foundations
of Computer Science, 1995, pages 224–246.
List of Notation
∃! 5
σ 13
A 13
R 13
STRUCT[σ] 14
FO 14
(A, A′
) 15
σn 16
∼= 17
ϕ(A) 17
ϕ(A, a) 17
Σ∗
17
ǫ 17
DTIME 19
NTIME 19
Ptime 19
NP 19
Pspace 20
NLog 20
DLog 20
PH 20
Σp
i 20
Πp
i 20
even 24
≡n 28
min 29
max 29
qr 32
FO[k] 32
tpk(A, a) 34
≃k 36
parity 41
BA
r 46
NA
r 46
⇆ 47
hlr 47
lr 49
≈ 50
degree 55
deg set 55
STRUCTl[σ] 55
⇆thr
d,m 61
(FO + C)inv 69
(L+ <)inv 69
SA
r 74
enc(A) 88
A 88
C 89
C 90
All 91
FO(+++,×××) 95
BIT 95
⋊⋉ 103
H(ϕ) 105
CQk 107
SO 113
MSO 115
∃SO 115
∃MSO 115
∀SO 115
∀MSO 115
MSO[k] 116
mso-tpk 116
≡MSO
117
FO(Cnt) 142
∃ix 142
FO(Q) 144
L∞ω 145
145
145
L∞ω(Cnt) 146
#x.ϕ 146
rk 146
L∗
∞ω(Cnt) 147
≡bij
151
304
LIST OF NOTATION 305
Laggr 159
℘ 178
lfp 178
ifp 179
pfp 180
Fϕ 180
IFP 180
PFP 180
LFP 181
LFPsimult
185
|ϕ|A
189
|a|A
ϕ 189
∃LFP 197
TrCl 199
trcl 199
posTrCl 200
FOk
212
Lk
∞ω 212
Lω
∞ω 212
PG 215
dom 218
rng 218
Iβ 218
Iα 218
tpFOk 220
A≤k
222
ϕm
a (x) 222
≈FOk 225
≺FOk 226
ck 229
Ck 229
Grn 235
µn(P) 235
µ(P) 235
EAn,m 238
EAk 238
RG 241
EA 241
∃SO(r) 243
adom 250
FO(M, σ) 250
∃x∈adom 250
∀x∈adom 250
FOact(M, σ) 251
N 253
RQC 255
FOgen
(M, σ) 256
R 260
Rlin 267
K 278
♦ϕ 279
ϕ 279
LTL 280
X 280
U 280
CTL 280
E 280
A 280
CTL∗
280
µx.ϕ 283
CSP 286
Index
Ackermann class 275
Active domain 250
formula 251
quantiﬁer 251
Aggregate
logic 159
expressiveness of 160
operator 159
Almost everywhere equivalence
245
Arity hierarchy 176
Asymptotic probability 235
of connectivity 236
of even and parity 237
of extension axioms 238
Automaton
and MSO 124
deterministic 17
nondeterministic 17
tree (ranked) 130
tree (unranked) 133
Back-and-forth 36
k 218
Ball 46
Bernays-Sch¨onﬁnkel class 275
BNDP (bounded number of degrees
property) 55
Boolean combination 15
Capturing
complexity class 168
coNP 169
DLog 208
NLog 200, 208
NP 169
PH 173
Pspace 194
Ptime 192, 208
Circuit
Boolean 89
family of 90
uniform 95
majority, or threshold 155
Class of structures
MSO-inductive 140
of bounded treewidth 110,135
of small degree 55
Collapse
active-generic 256,257
natural-active 255
restricted quantiﬁer 255
and VC dimension 273
fails over integers 260
for the real ﬁeld 261
to MSO 265
Combined complexity
of conjunctive queries 104
of FO 99
of LFP 207
of MSO 139
Completeness
fails over ﬁnite models 166
of games for FO 35
of games for MSO 117
Complexity
combined 88
data 88
expression 88
ﬁxed-parameter linear 100
ﬁxed-parameter tractable 100
parameterized 100
Complexity class
AC0
91
capturing of 168
coNP 20
DLog 20
Nexptime 21
Nlin 139
NLog 20
NP 19
306
INDEX 307
PH 20
Ptime 19
TC0
155
Composition method
for FO 30–31,42
for MSO 118,140
Conjecture
Crane Beach 273
Gurevich’s 204
McColm’s 210,234
Conjunctive query (CQ) 102
acyclic 105
combined complexity of 104
containment of 111
evaluation of 106,107, 110,111
union of 277
Connective
Boolean 15
inﬁnitary 145
Connectivity 23
and L∗
∞ω(Cnt) 153
and embedded ﬁnite models
254,260, 265
and FO 23, 37
and Hanf-locality 48
and MSO 120
topological 268, 272
Constraint satisfaction 285–288
and bounded treewidth 288
and conjunctive queries 286
and homomorphism 286
dichotomy for 287
Data complexity
of FO 92
of FO(Cnt) 155
of LFP 194
of MSO 134
over strings and trees 135
of µ-calculus 284
of temporal logics 281
of TrCl 200–203
Database
constraint 267–270
relational 1–4
Datalog 196
and existential least ﬁxed point
logic 197
and Ptime 199
monotonicity of 197
with negation 196
Duplicator 26
Encoding
of formulae 87
of structures 88
Extension axioms 238
and random graph 241
and zero-one law 240,244
asymptotic probability of 238
using in collapse results 265
Extensional predicates 196
Failure in the ﬁnite
Beth’s theorem 42
compactness theorem 24
completeness theorem 166
Craig’s theorem 42
L¨owenheim-Skolem theorem
166
Los-Tarski theorem 42
Finite variable logic (Lω
∞ω)
and ﬁxed point logics 214
and pebble games 216
deﬁnition of 212
First-order logic (FO) 14
expressive power of 28–31,37–
40
games for 32
Fixed-parameter linearity
of acyclic conjunctive queries
106
of FO for small degrees 101
of MSO and bounded treewidth
135
of MSO over strings and trees
135
of temporal logics 281
Fixed-parameter tractability
and bounded treewidth 110
308 INDEX
of FO on planar graphs 102
Fixed point 178
inﬂationary 179
least 178
partial 180
simultaneous 184
stages of 184, 186,188
FO with counting (FO(Cnt))
142
Formula
atomic 14
C-invariant 68
Hintikka 40
quantiﬁer-free 14
FPL 100
FPT 100
Gaifman graph 45
Gaifman-locality 48
Game
Ajtai-Fagin 123
and ∃MSO 123
bijective 59, 151
and L∗
∞ω(Cnt) 151
Ehrenfeucht-Fra¨ıss´e 26
for FO 26
for MSO 116
Fagin 122
pebble 215
and Lω
∞ω 216
Halting problem 19, 166
Hanf-locality 47
Hypergraph 105
tree decomposition of 105–108
Inexpressibility of
connectivity
in FO(All) 94
in ∃MSO 120
in L∗
∞ω(Cnt) 153
of arbitrary graphs in FO 23
of ﬁnite graphs in FO 37, 52
using Hanf-locality 48
even
in ﬁxed point logics 217
in FO 25
in Lω
∞ω 217
in MSO 118
of ordered sets 28
Hamiltonicity
in MSO 126
parity in FO(All) 94
Inﬂationary ﬁxed point logic (IFP)
180
Intensional predicates 196
Isomorphism 17
partial 27
with the k-back-and-forth
property 218
Join 102
Kripke structure 278
bisimilarity of 284
Language 17
regular 18
and MSO 124
star-free 127
and FO 127
Least ﬁxed point logic (LFP) 181
Linear order
aﬀects expressive power
69, 119,150, 153, 214
deﬁnability of 227
FO deﬁnability of 28–31
Locality
of aggregate logic 160
of FO 52
of L∗
∞ω(Cnt) 153
of order-invariant FO 73
Locality rank 49
bounds on 54,64
Hanf 47
Logic
aggregate 159
CTL 280
CTL∗
280
existential ﬁxed point 197
INDEX 309
ﬁnite variable 212
ﬁrst-order 14
FO with counting 142
inﬁnitary 145,212
inﬂationary ﬁxed point 180
least ﬁxed point 181
Lω
∞ω 212
L∗
∞ω(Cnt) 147
LTL 280
monadic second-order 115
µ-calculus 283
partial ﬁxed point 180
propositional modal 279
second-order 113
existential (∃SO) 115
universal (∀SO) 115
SO-Horn 208
SO-Krom 208
transitive closure 199
Model 13
embedded ﬁnite 250
ﬁnite 13
Model-checking problem
87, 100,281
Monadic second-order logic (MSO)
115
existential (∃MSO) 115
equals MSO over strings 126
universal (∀MSO) 115
diﬀerent from ∃MSO 120
µ-calculus (Calcµ) 283
Neighborhood 46
Normal form
for LFP 192,194
for SO 115
for TrCl 201
Occurrence
negative 181
positive 181
Operator 178
based on a formula 180
inductive 178
Order invariance 69
separation results for
ﬁxed point logics 217
FO 69
FO(Cnt) 158
Lω
∞ω 214
L∗
∞ω(Cnt) 153
MSO 119
undecidability of 174
Ordered conjecture 210
Partial ﬁxed point logic (PFP)
180
Polynomial hierarchy 20
and MSO 134
capturing of 173
Polynomial time 19
capturing of
in ∃SO 208
over ordered structures 192
over unordered structures
204–205
Projection 103
Property
bisimulation-invariant 285
ﬁnite model 276
and satisﬁability 276–278
Ramsey 257
and collapse 259
Propositional modal logic (ML)
279
Quantiﬁer
active domain 251
counting 141
existential 14
generalized
and Ptime 204
H¨artig 144
Rescher 144
unary 144
preﬁx 173,175,243, 275
rank 32
second-order 114
universal 14
310 INDEX
unrestricted 251
Quantiﬁer elimination
and collapse results 255
for the random graph 247
for the real ﬁeld 261
Query 17
Boolean 17
complexity of 88
conjunctive 102
deﬁnable in a logic 17
Gaifman-local 48
Hanf-local 47
invariant 68
order-invariant 69
weakly local 73
r.e. see Recursively enumerable
Random graph 241
and quantiﬁer elimination 247
collapse over 265
representations of 248
theory of 242
Rank
in L∗
∞ω(Cnt) 146
quantiﬁer 32
for unary quantiﬁers 144
in FO 32
in SO 115
Reachability 2, 122
and Gaifman-locality 49
for directed graphs in ∃MSO
122
for undirected graphs in ∃MSO
122
Recursive 19
Recursively enumerable 19
RQC (restricted quantiﬁer collapse)
255
Satisﬁability
for Ackermann class 277
for Bernays-Sch¨onﬁnkel class
276
for FO2
278
Second-order logic (SO) 113
Selection 103
Sentence 15
atomic 33
ﬁnitely satisﬁable 165
ﬁnitely valid 165
quantiﬁer rank of 32
satisﬁable 16
valid 16
Simultaneous ﬁxed point 184
elimination of 186
Sphere 74
Spoiler 26
Structure 13
canonical for FOk
229
Kripke 278
rigid 234
k-rigid 227
Symbol
constant 13
function 13
relation 13
Term 14
counting 146
Theorem
Abiteboul-Vianu 230
Ajtai’s 94
Beth’s 42
B¨uchi’s 124
compactness 16
completeness 16
Cook’s 173
Courcelle’s 135
Craig’s 42
Ehrenfeucht-Fra¨ıss´e 32
Fagin’s 169
Furst-Saxe-Sipser 94
Gaifman’s 60
Grohe-Schwentick 73
Gurevich’s 69
Gurevich-Shelah 191
Immerman–Szelepcs´enyi 200
Immerman-Vardi 192
L¨owenheim-Skolem 16
Los-Tarski 42
INDEX 311
Lyndon’s 43
Ramsey’s 257
Stage comparison 189
Tarski-Knaster 179
Trakhtenbrot’s 165
Theory 16
complete 242
consistent 16
decidable 242
ω-categorical 242
Threshold equivalence 61
Transitive closure 3, 17
expressible in Datalog 196
expressible in ﬁxed point logics
182
inexpressible in aggregate logic
160
inexpressible in FO 52
violates locality 49
Transitive closure logic (TrCl)
199
positive 200
Tree 129
automata 130
decomposition 105, 107
regular languages and MSO
131
unranked 132
automata 133
regular and MSO 133
Treewidth 107
bounded 108, 110,135, 140
Turing machine 18
and logic 166–168,170–
172,193–194,201
deterministic 18
time and space bounds 19
Type
atomic 226
FOk
220
expressibility of 221–225
ordering of 227–229
in L∗
∞ω(Cnt) 152
rank-k, FO 34
expressibility of 35
ﬁnite number of 35
rank-k, MSO 116
and automata 125–126
expressibility of 116
Variable
bound 15
free 14
Vocabulary 13
purely relational 14
relational 14
Zero-one law 237
and extension axioms 240
failure for MSO 247
for Lω
∞ω 237
for FO and ﬁxed point logics
237
for fragments of SO 243–245
Name Index
Abiteboul, S.
VIII, 206, 207,229, 230,232, 246,
288
Afrati, F. 207
Aho, A. VII
Ajtai, M.
94, 108,123, 136,174,206
Asser, G. 174
Barrington, D. A. M.
108,161, 271
Barthelmann, K. 63
Barwise, J. 232
Benedikt, M. 137,270
Blass, A. 246,247
Blumensath, A. 137
Bodlaender, H. 137
B¨orger, E. 288
Bruy`ere, V. 137
B¨uchi, J. VIII, 11, 124,136
Bulatov, A. 289
Buss, S. 108
Cai, J. 206
Cameron, J. 246
Chandra, A. VII, 108, 109,206
Chang, C.C. 21
Chapuis, O. 271
Clarke, E. 288
Compton, K. 246
Cook, S. A. 40, 108,174
Cosmadakis, S. 137, 207
Courcelle, B. 135,137
Dalmau, V. 289
Dawar, A. 109, 206,207, 232,233
Denenberg, L. 108
de Rougemont, M. 136,233
Dong, G. 63
Downey, R. 108
Ebbinghaus, H.-D.
VIII, 21,40, 83, 136, 206
Ehrenfeucht, A. 26, 32, 40
Eiter, T. 174
Emerson, E. A. 288
Enderton, H. 21
Erd¨os, P. 246
Etessami, K. 161, 288
Fagin, R.
VII, 6, 62, 120,122,123, 136,165,
168–174,193–
195,200, 204,246
Feder, T. 40, 289
Feferman, S. 137,232
Fellows, M. 108
Flum, J. VIII, 21, 40, 83, 108–
110,136, 206, 271
Fournier, H. 271
Fra¨ıss´e, R. 26, 32, 40
Frick, M. 108, 109,137
F¨urer, M. 206
Furst, M. 94, 108
Gaifman, H. 40, 45, 48, 63, 246
Gire, F. 206
Glebskii, Y. 246
Gottlob, G.
108,109, 174,207, 288
Gr¨adel, E. 137,161, 174,206,288
Graham, R. 270
Grandjean, E. 137,246
Grohe, M. 73, 83, 108–
110,137, 206,207, 233, 289
Grumbach, S. 270
Grumberg, O. 288
Gurevich, Yu.
40, 69, 73, 83, 108,161,174,
191,192, 204,206, 228,246,247, 288
Hanf, W. 47,62
312
NAME INDEX 313
Harel, D. VII, 206
Hella, L. 63,161, 206,207, 246
Herr, L. 288
Hoang, H. 206
Hodges, W. 21,246
Hopcroft, J. 11, 21
Hull, R. VIII, 206, 271
Immerman, N.
VII, 108, 109,161, 192,195,200,
206,226, 232,271, 288
Janin, D. 136,289
Johnson, D. 21
Jones, N. 174
Kamp, H. 288
Kanellakis, P. 136,270
Karp, C. 246
Kaufmann, M. 246
Keisler, H. J. 21
Khoussainov, B. 21
Kleene, S. 174
Knaster, B. 179
Knyazev, V. 246
Koch, C. 207
Koiran, P. 271
Kolaitis, Ph.
161,174, 206,210, 232,233,
246,247, 288,289
Kozen, D. 246, 247
Kuijpers, B. 271
Kuper, G. 270
Ladner, R. 136,289
Lander, E. 161
Lautemann, C. 174, 271
Le Bars, J.-M. 246, 247
Leivant, D. 206
Leone, N. 108, 109
Libkin, L. 63, 83, 137,161, 270
Lindell, S. 207, 232,233
Livchak, A. B. 206
Luosto, K. 246
Lynch, J. 137, 246
Lyndon, R. 43,207
Makowsky, J. 40, 136, 137,174
Marcinkowski, J. 136, 207
Matz, O. 137
McColm, G. 206,210, 233
McNaughton, R. 136
Merlin, P. 108, 109
Meyer, A. 137
Moller, F. 289
Mortimer, M. 288
Moschovakis, Y. 206
Nerode, A. 21
Neven, F. 136,137
Nurmonen, J. 63, 161
Olive, F. 137
Otto, M.
83, 137,206, 207,232,270
Papadimitriou, C.
21, 108,109, 206
Papert, S. 136
Paredaens, J. 270, 271
Peled, D. 288
Pezzoli, E. 40
Pillay, A. 272
Pnueli, Y. 174
Poizat, B. 21, 232
Rabin, M. 140
Rabinovich, A. 289
Rado, R. 246
R´enyi, A. 246
Revesz, P. 270
Robertson, N. 110, 140
Robinson, J. 271
Rosen, E. 40
Rosenstein, J. 40
Rossman, B. 83
Rothschild, B. 270
Sagiv, Y. 288
Saxe, J. 94, 108
Sazonov, V. 206
Scarcello, F. 108,109
Schaefer, T. 289
314 NAME INDEX
Schweikardt, N. 137,271
Schwentick, T.
63, 73, 83, 108,109,136, 137,
174
Seese, D. 108,137
Segouﬁn, L. 108, 109
Selman, A. 174
Seymour, P. 110,140
Shelah, S.
108,191, 192,206, 207,228,246
Shmueli, O. 207
Sipser, M. 21, 94, 108
Spencer, J. 270
Steinhorn, C. 272
Stockmeyer, L.
62, 108,136, 137,174
Straubing, H. 108,161
Su, J. 270
Szelepcs´enyi, R. 200,206
Talanov, V. 246
Tarjan, R. 108
Tarski, A. 179,271
Thatcher, J. 137
Th´erien, D. 174,271
Thomas, W. VIII, 21, 136,137
Trakhtenbrot, B.
VII, 165, 166,170, 171,174,
193,195
Tur´an, G. 136
Ullman, J. D. VII, 11, 21
V¨a¨an¨anen, J. VIII, 161
van Benthem, J. 288
van Dalen, D. 21
Van den Bussche, J. 270, 271,288
Van Gucht, D. 270
Vardi, M. Y.
VII, 40, 62, 108,136, 192,195,
200,206, 210,226, 232,233,246, 247,
288,289
Vaught, R. 137
Veith, H. 288
Vianu, V.
VIII, 206, 207,229, 230,232, 246
Vollmer, H. 109,161
Walukiewicz, I. 289
Weinstein, S. 40, 207,232,233
Wilke, T. 288
Wilkie, A. 272
Wong, L. 63, 83
Wright, J. 137
Yannakakis, M.
108,109, 206,207, 288
Ziegler, M. 271