Efﬁcient Implementation of Property Directed
Reachability
Niklas Een, Alan Mishchenko, Robert Brayton
{een,alanmi,brayton}@eecs.berkeley.edu
Berkeley Veriﬁcation and Synthesis Research Center
EECS Department
University of California, Berkeley, USA.
Abstract—Last spring, in March 2010, Aaron Bradley published
the ﬁrst truly new bit-level symbolic model checking algorithm
since Ken McMillan’s interpolation based model checking procedure
introduced in 2003. Our experience with the algorithm
suggests that it is stronger than interpolation on industrial problems,
and that it is an important algorithm to study further. In
this paper, we present a simpliﬁed and faster implementation of
Bradley’s procedure, and discuss our successful and unsuccessful
attempts to improve it.
I. INTRODUCTION
Sequential veriﬁcation is hard, both model checking and equivalence
checking. Difﬁcult instances are typically solved using
several simpliﬁcation steps followed by multiple veriﬁcation
engines scheduled sequentially or concurrently. Despite all the
available tools, numerous practical instances remain unsolved.
Therefore, research in formal veriﬁcation is always on the
lookout for methods that can handle difﬁcult cases.
In 2003, a new verification method based on interpolation
[7] was proposed by Ken McMillan to address hard UNSAT
instances. Over time it was perfected and is currently considered
one of the most valuable formal veriﬁcation methods.
More recently, another novel method was pioneered by
Aaron Bradley [1], [2]. He named his implementation IC3,
but gave no name to the method itself. We choose to call it
property directed reachability (PDR) to connect it to Bradley’s
earlier work on property directed invariant generation.
It came as a surprise that IC3 won third place in the hardware
model checking competition (HWMCC) at CAV 2010.
It was marginally outperformed by two mature integrated
veriﬁcation systems, both carefully tuned to balance several
different engines. As such, the new method appears to be the
most important contribution to bit-level formal veriﬁcation in
almost a decade.
Although PDR has been generally known for less than a
year, while interpolation has been around long enough for
numerous improvements and extensions to be proposed, for
example [4], [3], an up-to-date implementation of PDR can
solve more instances from HWMCC than interpolation can.
This is also true for the benchmarks our group has received
from our industrial collaborators. Another remarkable property
of PDR is its capability of ﬁnding deep counterexamples.
Although on average BMC does better than PDR, there are
many benchmarks where PDR can ﬁnd counterexamples that
elude both BMC and BDD reachability. Finally, PDR lends
itself naturally to parallel implementation, as was explained
in Bradley’s original work.
In this paper, we explore PDR and try to understand the
reason for its effectiveness. We propose a number of changes
to the algorithm to improve its performance and to simplify
its implementation. In particular:
– We achieve a signiﬁcant speedup by using three-valued
simulation to reduce the burden on the SAT-solver.
– We eliminate a tedious and error-prone special-case handling
of counterexamples of length 0 or 1.
– We show experimentally that two elements of the original
algorithm give no speedup: (i) variable activity and (ii)
cube generalization beyond non-inductive regions.
– We separate the main algorithm from the handling of
SAT queries through a clean interface. This separation
reduces the overall complexity.
– We refute some potential improvements experimentally.
– We present detailed pseudo-code to fully document our
implementation.
II. PRELIMINARIES
This paper considers the veriﬁcation of systems modeled
using ﬁnite state machines (FSMs). Each state of the FSM is
identiﬁed with a boolean assignment to a set of state variables.
The FSM further deﬁnes a set of initial states and a set of
property states. The algorithm to be presented veriﬁes that
there exists no sequence of transitions from an initial state to
a non-property state (“bad” state).
In the presentation, a state variable or its negation is referred
to as a literal, a conjunction of literals as a cube, and the
negation of a cube (a disjunction of literals) as a clause. If a
cube contains all the state variables, it is called a minterm. It
is assumed that the FSM is represented symbolically in a way
that can be translated into propositional logic for a SAT-solver.
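To make the terminology concrete, the following C++ sketch shows one possible encoding, assumed here purely for illustration: state variable v is written as the integer +v and its negation as -v, and a cube or clause is a vector of such integers. The names Lit, Cube, Clause, negateCube and isMinterm are hypothetical and are not taken from the implementation described in this paper.

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

// Hypothetical encoding: state variable v (numbered from 1) is the literal +v,
// and its negation is the literal -v.
using Lit = int;
using Cube = std::vector<Lit>;    // conjunction of literals
using Clause = std::vector<Lit>;  // disjunction of literals

// De Morgan: the negation of the cube (l1 AND ... AND ln) is the clause
// (NOT l1 OR ... OR NOT ln).
Clause negateCube(const Cube& c) {
    Clause out;
    out.reserve(c.size());
    for (Lit l : c) out.push_back(-l);
    return out;
}

// A cube is a minterm when every one of the numVars state variables occurs in it.
bool isMinterm(const Cube& c, int numVars) {
    std::vector<bool> seen(numVars + 1, false);
    for (Lit l : c) {
        int v = std::abs(l);
        assert(v >= 1 && v <= numVars);
        seen[v] = true;
    }
    for (int v = 1; v <= numVars; ++v)
        if (!seen[v]) return false;
    return true;
}
```

With three state variables, the cube {1, -3} is not a minterm (variable 2 is missing), and its negation is the clause {-1, 3}.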
III. OVERVIEW OF PDR
The PDR algorithm can be understood on several levels. This
section addresses:
(1) How it works.
(2) Why it is complete.
(3) What makes it effective.
FMCAD 2011, Page 125
In particular the ﬁrst two points can be understood in terms
of approximate reachability analysis. For the third point one
must also consider the inductive ﬂavor of the algorithm.
A. Notation
Let I and P be predicates over the FSM’s state variables,
denoting the initial states and the property states respectively.
Also let T denote the transition relation over the current and
next state variables. Given a cube s, a call to the underlying
SAT solver will be expressed as:
SAT? P ∧ ¬s ∧ T ∧ s′
using primes to denote next states. This query asks “starting
in a state where the property holds, but outside the cube s,
can you get inside the cube s in one transition”. If the answer
is UNSAT, then it has been proved that (P ∧ ¬s ∧ T) → ¬s′,
and ¬s is said to be inductive relative to P.
In the algorithm, some cubes will be proved unreachable
from the initial states in k steps or less. Such cubes will be
referred to as blocked cubes of frame k.
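The relative-induction query can be illustrated on an explicitly enumerated FSM. The sketch below is only a stand-in for the SAT call: states are plain integers, the transition relation T is a list of (from, to) pairs, and P and s are given as predicates. The function name relQuerySat and this explicit-state representation are assumptions made for the example; a real implementation works symbolically.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Explicit-state stand-in for the query  SAT? P AND (NOT s) AND T AND s'.
// Returns true (SAT) if some transition leads from a P-state outside s
// into s; returns false (UNSAT) if NOT s is inductive relative to P.
bool relQuerySat(const std::vector<std::pair<int, int>>& T,
                 const std::function<bool(int)>& P,
                 const std::function<bool(int)>& inS) {
    for (const auto& tr : T)
        if (P(tr.first) && !inS(tr.first) && inS(tr.second))
            return true;   // witness transition tr.first -> tr.second
    return false;
}
```

For a four-state counter 0 → 1 → 2 → 3 → 0 with s = {3}, the query is SAT when P is "state ≠ 3" (witness 2 → 3), and UNSAT when P is "state < 2", since no P-state has a transition into s.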
B. Mechanics
PDR maintains a list of facts which we will call the trace: [R0,
R1, ..., RN]. The first element R0 is special; it is simply
identiﬁed with the initial states. For k > 0, Rk is a set of
clauses that AND-ed together represent an over-approximation
of the states reachable from the initial states in k steps or less.
The trace is maintained in such a way that Ri is contained in
Ri+1. In fact, this relation is syntactic: every clause of Ri+1
is also present in Ri, except for i = 0 (R0 has no clauses).
Together with the trace, the PDR algorithm maintains a set
of proof-obligations. A proof-obligation consists of a frame
number k and a cube s, where s is either a set of bad states
or a set of states that can all reach a bad state in one or more
transitions. The frame number k indicates a position in the
trace where s must be proved unreachable, or else the property
fails.
By manipulating the trace and the set of proof-obligations
according to a scheme detailed below, PDR derives new facts
and adds them to the trace until it has either (i) produced an
inductive invariant proving the property, or (ii) added a proof-obligation
at frame 0 with a cube that intersects the initial
states. Such a cube cannot be blocked and entails the existence
of a counterexample.
(1) PROOF-OBLIGATIONS: The core of PDR lies in how
proof-obligations are handled, and how new facts are derived
from them. All reasoning in PDR takes place on one transition
relation; there is no unrolling of the FSM as in, e.g., BMC.
Given the proof-obligation (s, k), consider the query:
SAT? Rk−1 ∧ T ∧ s′ (Q1)
If it is UNSAT, then the facts of Rk−1 are strong enough
to block s at frame k, and we can add the clause ¬s to
Rk. However, the syntactic containment relation of the trace
requires us also to add the same clause to all preceding Ri,
i < k. Is it sound to do this? Consider replacing Rk−1 with
Rk−2 in the query. Containment states that Rk−2 is stronger
than Rk−1, so the query remains UNSAT. Likewise for Rk−3
and so on, all the way back to the initial states. The only thing
left to check is whether s intersects the initial states or not. If
s is not blocked by R0, then we cannot strengthen the trace by
¬s. In the algorithm, this query will not be used if s overlaps
with the initial states.
Using this approach, the quality of the learned clause
depends on the size of the cube in the proof-obligation. In
practice, these cubes often have many literals, and the negation
¬s is a weak fact. It turns out to be crucial for the performance
of PDR to try to learn stronger facts, i.e. cubes with fewer
literals. To achieve this, the above learning scheme is improved
in two ways:
Improvement 1 – “Generalize s”. Many modern SAT-solvers
do not simply return UNSAT, but also give a reason for the
unsatisﬁability; either through an UNSAT-core or through a
ﬁnal conﬂict-clause. Both these mechanisms can be used to
extract precise information about which clauses were actually
used in the proof. Since s is a conjunction inside the query,
it translates into a set of unit clauses. Not all of those clauses
may actually be needed when proving UNSAT. Any literal of
s corresponding to an unused clause can be removed without
affecting the UNSAT status. This provides a virtually free
mechanism of removing literals that just happen not to be
used by the SAT-solver.
A more directed, but also more expensive, approach is to
explicitly try to remove the literals one by one. If the query
remains UNSAT in the absence of a literal, good riddance. If
not, put the literal back. Although the order in which literals
are probed affects the outcome, the procedure is monotone
in the sense that removing a literal cannot make a satisﬁable
query UNSAT. Note that we cannot remove a literal if it makes
s intersect with the initial states, even if the query is UNSAT.
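The explicit literal-removal loop for the monotone query (Q1) can be sketched as follows. The two callbacks are placeholders supplied by the caller: queryIsUnsat(c) is assumed to answer whether the query with cube c is UNSAT, and intersectsInit(c) whether c contains an initial state. Neither corresponds to a real solver API; the function name dropLiterals and the integer literal encoding are likewise hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

using Lit = int;                 // +v / -v, hypothetical encoding
using Cube = std::vector<Lit>;

// Try to remove the literals of s one by one. A literal stays removed only if
// the shortened cube is still blocked (query UNSAT) and does not intersect
// the initial states.
Cube dropLiterals(Cube s,
                  const std::function<bool(const Cube&)>& queryIsUnsat,
                  const std::function<bool(const Cube&)>& intersectsInit) {
    std::size_t i = 0;
    while (i < s.size()) {
        Cube t = s;
        t.erase(t.begin() + static_cast<std::ptrdiff_t>(i));  // probe: s minus one literal
        if (!intersectsInit(t) && queryIsUnsat(t))
            s = t;   // literal was not needed; keep the shorter cube
        else
            ++i;     // literal is needed; leave it in and move on
    }
    return s;
}
```

The result depends on the probing order, as noted above; this sketch simply probes from the first literal to the last.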
Improvement 2 – “Add ¬s to the query”. A key insight of
Bradley was to realize that the query could be extended by
the term ¬s:
SAT? Rk−1 ∧ ¬s ∧ T ∧ s′ (Q2)
Adding an extra conjunct means the query is more likely to
be UNSAT, which improves chances of removing a literal, or
indeed learning a clause at all. This extended query is depicted
in Figure 1. Having s on both sides of the transition breaks
monotonicity: as s gets weaker, ¬s gets stronger. A query that
is SAT may become UNSAT if more literals are removed—
which makes the task of ﬁnding a minimal cube much harder
(exponential in the size of s). Heuristics for minimizing s are
discussed in [2].
But why is it sound to add ¬s to the query? It can be viewed
as a bounded inductive reasoning: The base case R0 → ¬s
holds by construction (s does not intersect the initial states).
We have proved that (Rk−1 ∧ ¬s ∧ T) → ¬s′, but because
Ri is stronger than Rk−1 for i < k − 1, we have also proved
that ¬s is preserved by every transition up to frame k.
Figure 1. Is s inductive relative to Rk−1? In the SAT query, we try
to ﬁnd a minterm m in the white region of the ﬁrst frame, that in
one transition can reach a point inside the cube s. The white region
satisﬁes Rk−1 ∧¬s, illustrated by the four blocked cubes c1 through
c4 and the cube s. If the query is UNSAT, it has been proved that
a point outside s stays outside s for the ﬁrst k transitions from the
initial states. When generalizing s, we must make sure that the cube
does not grow to intersect the initial states. This property, together
with UNSAT, proves s to be unreachable in the ﬁrst k frames. Note
that the ﬁgure also illustrates how Rk contains a subset of the cubes
of Rk−1.
(2) SATISFIABLE QUERIES: We now turn to the case where
the query (original or extended) is SAT. This means Rk−1 was
not strong enough to block s at frame k, and something new
must be learned at frame k−1. From the satisfying assignment,
we can extract a minterm m in the pre-image of s, which gives
us a new proof-obligation (m, k − 1).
The above learning scheme can now be applied to this proof-obligation,
drawing from Rk−2 to learn clauses in Rk−1. If
Rk−2 is not strong enough, the procedure may recursively go
further back into the trace and learn a whole cascade of facts
over many time-frames. Eventually the procedure returns to
the original proof-obligation (s, k) and may either succeed in
blocking it this time, or generate a new minterm in the pre-image
of s.
As noted in the previous section, learning short clauses is
crucial for PDR to work. Indeed, most of the runtime is spent
on generalizing cubes by removing literals. Because a minterm
is maximally long, it is a particularly undesirable starting point
for this process. To alleviate this situation, we propose to
shrink the proof-obligations by using three-valued (ternary)
simulation.1
It requires the FSM to be in circuit form, but in
practice this is often the case.
Reducing proof-obligations by ternary simulation. For a
satisﬁable query, extract the minterm m from the satisfying
assignment, giving values to the ﬂop outputs as well as the
primary inputs. Simulate this assignment through one time-frame.
Now, probe each ﬂop by changing its value to X and
propagate the effect of this using ternary simulation. If an X
does not appear at any ﬂop input among the ﬂops in s, then
the probed ﬂop (state variable) can be safely removed from
the proof-obligation. If an X does reach a flop in s, undo the
propagation and the probing, and move on to the next ﬂop.
The resulting cube has the property that all the states it
represents can reach s in one transition, and hence the entire
cube must be blocked.
1Ternary logic has three values: 0, 1, and X. The binary semantics is
extended by: (X ∧ 0 = 0), (X ∧ 1 = X), (X ∧ X = X), (¬X = X).
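The probing procedure can be sketched in C++ on a small AND/inverter netlist. The netlist layout (leaves first, each gate appending one signal), the names Tern, Gate, simulate and shrinkByTernarySim, and the choice to re-simulate the whole circuit for every probe instead of propagating changes incrementally are all simplifying assumptions of this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Tern { Zero, One, X };

Tern tnot(Tern a) {
    if (a == Tern::X) return Tern::X;
    return a == Tern::Zero ? Tern::One : Tern::Zero;
}

Tern tand(Tern a, Tern b) {
    if (a == Tern::Zero || b == Tern::Zero) return Tern::Zero;  // 0 dominates X
    if (a == Tern::X || b == Tern::X) return Tern::X;
    return Tern::One;
}

// Hypothetical netlist: the first signals are leaves (flop outputs, then
// primary inputs); each gate appends one signal, the AND of two earlier
// signals, each optionally inverted.
struct Gate { int a; bool invA; int b; bool invB; };

std::vector<Tern> simulate(std::vector<Tern> vals, const std::vector<Gate>& gates) {
    for (const Gate& g : gates) {
        Tern x = g.invA ? tnot(vals[g.a]) : vals[g.a];
        Tern y = g.invB ? tnot(vals[g.b]) : vals[g.b];
        vals.push_back(tand(x, y));
    }
    return vals;
}

// leaves:  binary values of the flop outputs followed by the primary inputs.
// watched: signals driving the flop inputs of the flops that occur in s.
// Returns, for each flop, whether it must stay in the proof-obligation.
std::vector<bool> shrinkByTernarySim(std::vector<Tern> leaves, int numFlops,
                                     const std::vector<Gate>& gates,
                                     const std::vector<int>& watched) {
    std::vector<bool> keep(numFlops, true);
    for (int f = 0; f < numFlops; ++f) {
        Tern saved = leaves[f];
        leaves[f] = Tern::X;                        // probe flop f
        std::vector<Tern> vals = simulate(leaves, gates);
        bool xReached = false;
        for (int w : watched)
            if (vals[w] == Tern::X) xReached = true;
        if (xReached) leaves[f] = saved;            // undo the probe
        else keep[f] = false;                       // flop f can be dropped
    }
    return keep;
}
```

For a single gate computing flop0 AND input, flop0 must be kept when the input is 1, but can be dropped when the input is 0, because the 0 masks the X.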
C. The Algorithm
For clarity, we state the precise properties of the trace:
(1) R0 = I.
(2) All Ri except R0 are sets of clauses.
(3a) Ri → Ri+1.
(3b) The clauses of Ri+1 are a subset of the clauses of Ri for i > 0.
(4) Ri+1 is an over-approximation of the image of Ri.
(5) Ri → P, except for the last element RN of the trace.
We note that (5) is different from Bradley’s original presentation,
which also required the property to hold for RN . The
change eliminates the need for the special BMC check of
length 0 and 1, performed in Bradley’s implementation of
PDR.
At the start of the algorithm the trace has just one element
R0. It then runs the following main loop:
while SAT? RN ∧ ¬P do
(a) extract a bad state m from the SAT model
(b) generalize m to a cube s using ternary simulation
(c) recursively block the proof-obligation (s, N)
When the loop terminates, the property holds for RN , and
an empty frame is added to the trace. The algorithm will be
repeated for this new frame, but ﬁrst a propagation phase is
executed, where learned clauses are pushed forward in the
trace:
for k ∈ [1, N − 1] and c ∈ Rk do
if c holds in frame k + 1, add it to Rk+1
During the propagation phase it is important to do syntactic
subsumption. If a clause c was moved forward from frame k
to k + 1, and frame k + 1 has a weaker clause d ⊇ c, then
d should be removed. Subsumed clauses accumulate quickly,
but serve no purpose except to slow down the SAT-solver.
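Syntactic subsumption is a subset test on literal sets. The sketch below assumes clauses are stored as sorted integer vectors, which is a representation chosen for this example only; the helper names subsumes and addWithSubsumption are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

using Lit = int;
using Clause = std::vector<Lit>;   // kept sorted (assumption of this sketch)

// c subsumes d when every literal of c also occurs in d (c is a subset of d).
bool subsumes(const Clause& c, const Clause& d) {
    assert(std::is_sorted(c.begin(), c.end()) && std::is_sorted(d.begin(), d.end()));
    return std::includes(d.begin(), d.end(), c.begin(), c.end());
}

// Add c to a frame and drop every clause of the frame that c subsumes.
// c must not refer to an element of the frame itself.
void addWithSubsumption(std::vector<Clause>& frame, const Clause& c) {
    frame.erase(std::remove_if(frame.begin(), frame.end(),
                               [&](const Clause& d) { return subsumes(c, d); }),
                frame.end());
    frame.push_back(c);
}
```

Adding the clause {-3, 1} to a frame containing {-3, 1, 2} removes the longer clause, since the shorter one implies it.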
(1) QUEUE OF PROOF-OBLIGATIONS: Section III-B1 suggests
a recursive clause-learning scheme. However, PDR can
be improved by reusing proof-obligations of one time-frame in
all future time-frames. After all, if a cube is bad, it should be
blocked everywhere. This requires a queue, as the algorithm
now can have many outstanding proof-obligations in each
frame. The elements should be dequeued from the smallest
time-frame ﬁrst. This change has the added beneﬁt of making
PDR capable of ﬁnding counterexamples longer than the trace.
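One way to realize such a queue in C++ is std::priority_queue with a reversed comparison, so that the obligation with the smallest frame is dequeued first. The type names Obligation, LaterFrame and ObligationQueue and the helper popMin are assumptions of this sketch; the relative order of obligations within the same frame is left unspecified here.

```cpp
#include <cassert>
#include <queue>
#include <vector>

using Cube = std::vector<int>;

struct Obligation {
    Cube cube;
    unsigned frame;
};

// std::priority_queue pops the "largest" element, so the comparison is
// reversed to make the obligation with the smallest frame come out first.
struct LaterFrame {
    bool operator()(const Obligation& a, const Obligation& b) const {
        return a.frame > b.frame;
    }
};

using ObligationQueue =
    std::priority_queue<Obligation, std::vector<Obligation>, LaterFrame>;

// Remove and return the obligation with the smallest frame number.
Obligation popMin(ObligationQueue& q) {
    assert(!q.empty());
    Obligation o = q.top();
    q.pop();
    return o;
}
```

After an obligation (s, k) has been blocked, re-enqueueing (s, k + 1) in the same queue gives the reuse across time-frames described above.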
(2) TERMINATION: PDR can terminate in one of two
ways: either (i) a proof-obligation at frame 0 intersects with
the initial states, which implies that the property fails (in this
case, a counterexample can be extracted with some additional
bookkeeping); or (ii) the clause sets of two adjacent frames
become syntactically identical: Ri ≡ Ri+1. Since Ri → P
by (5); Ri ∧ T → R′i+1 by (4); I → Ri by (1) and (3a); then
Ri is an inductive invariant that proves the property.
Figure 2. BMC unrolling of length N. The design is reset in the ﬁrst
frame and Bad is asserted in the last frame. The last frame is drawn
partially because the next-state logic for the ﬂops is not needed.
D. Convergence
Must the main loop terminate for some ﬁnite trace length?
When generated, each proof-obligation (s, k) contains at least
one state that is not previously blocked. If the proof-obligation
is immediately handled, or if the generalization procedure
ﬁrst checks that this still holds when the proof-obligation is
dequeued, then every clause created by the learning algorithm
must block at least one more state of frame k. Because there
is a ﬁnite number of frames, and a ﬁnite number of states in
the FSM, the main loop is guaranteed to terminate.
Can the length of the trace grow indeﬁnitely? If the syntactic
termination check (Ri ≡ Ri+1) were done semantically
instead (Ri = Ri+1), then clearly this cannot happen. Ri+1
would have to block at least one state less than Ri. Suppose
therefore Ri = Ri+1 but Ri ≢ Ri+1. During the propagation
phase, all clauses of Ri will be moved into Ri+1, making
them syntactically identical and the algorithm terminates.
We note that the bound (2^|S| frames with at most 2^|S|
clauses in each) implied by the above argument is very large,
and in no way explains why the algorithm performs well
in practice.
E. What makes PDR so effective?
The experimental analysis of Section VI shows that PDR represents
a major performance improvement over interpolation
based model checking (IMC) [7], hitherto regarded as the
strongest bit-level engine. Why is this?
Consider the BMC unrolling depicted in Figure 2. Assume
for simplicity that the design can non-deterministically return
to the initial states at any time.2
This guarantees that the set of
reachable states grows monotonically with the frame number.
The ﬁrst version of IMC, never published,3
considered such
an unrolling, and from an UNSAT proof computed interpolants
Φi between every pair of adjacent time-frames. This sequence of
interpolants has the property:
(1) I = Φ0
(2) Φk ∧ T → Φ′k+1
(3) ΦN → P
(4) symbols(Φi) ⊆ state-variables
If N is chosen large enough, one of the interpolants Φk must
be an inductive invariant proving the property: if the sufﬁx
after Φk is longer than the backward diameter of the system,
it cannot contain any state that can reach Bad; the prefix
before Φk grows monotonically and for a finite system must
eventually repeat itself.
2This behavior can be achieved by rewiring the flops, or, alternatively, be
made part of the verification algorithm.
3Private conversation with Ken McMillan.
This method is not as effective as the published version of
IMC. So what is wrong with it? One can argue that the important
feature of interpolation is its generalizing capability.4
For
instance, the interpolant Φ1 can be viewed as an abstraction
of the ﬁrst time-frame, containing just the facts needed for the
sufﬁx to be unsatisﬁable (this interpretation is particularly in
accord with McMillan’s asymmetric interpolant computation).
Even though logically (2) implies that each interpolant can
be derived from its predecessor, this is not how the SAT-solver
constructs them. During its search, the solver is free to roam
all over the unrolling. We argue that this may deteriorate the
generalizing capability of interpolation.
In the published algorithm, McMillan used the insight that
(a) interpolants are smaller and more general toward the ends
of the unrolling, and (b) repeatedly applying interpolation on
its own output will improve the generalizing capability. In his
algorithm, the interpolant Φ1 is therefore repeatedly used to
replace the initial states constraint, resulting in interpolants
that are less and less dependent on the initial states and in
an increasingly more general way imply the unsatisﬁability of
the sufﬁx.
In a way, the procedure can be viewed as committing to
the abstraction that was computed. It disallows the SAT-solver
from going back to the real initial states and learning more
facts. For this to work, the sufﬁx must be long enough to
prevent any state that can reach the bad states from entering
into the interpolants. If it fails to prevent this, the algorithm
has to start over from scratch, typically with a longer unrolling
(although randomizing the SAT-solver and restarting with the
same length works sometimes).
We now compare this to how PDR works. First note that
at the end of each major cycle, just before pushing clauses
forward, the Ri are in fact interpolants; all the facts in frame
k and future frames are derived from Rk.
During the computations, PDR completely commits to its
current abstractions Ri. The localized reasoning prevents it
from learning new facts from earlier time-frames unless it
has been proved that new facts must be learned. In a way,
the whole procedure can be viewed as one big SAT-solving
process, where the solver is carefully controlled to make sure
it does not roam all over the unrolling. Further, when new facts
are brought in from previous frames, a lot of effort is spent on
simplifying those facts (the literal removing consumes ∼80%
of the runtime). There is no similar mechanism in IMC; it
must use whatever proof the SAT-solver happened to give it.
Also, PDR constantly removes subsumed clauses, especially
during the forward-propagation phase.
To summarize, PDR sticks to the facts it has learned as long
as possible, similar to the way IMC commits to its interpolants.
If what PDR has learned at a frame is too weak, it can repair
the situation by learning new clauses rather than scrapping all
the work done so far and starting over, as IMC does. PDR
has a very targeted approach to producing small facts by its
literal removing scheme, and it constantly weeds out redundant
clauses by subsumption checking and forward propagation.
4Indeed, interpolation based model checking is probably better understood
as a method for “guessing” an inductive invariant rather than, as often done,
an approximate reachability analysis.
A possible drawback of PDR, however, is the strong inductive
bias of its learning: it can only learn clauses in
terms of state variables. But this bias is also the very reason
it can efﬁciently do generalization. It might be that future
improvements to the algorithm will allow it to work efﬁciently
on a different domain.
IV. IMPLEMENTATION
This section details our implementation of PDR. In the pseudocode,
only cubes are used and not clauses. In particular we
represent the trace as sets of blocked cubes rather than learned
clauses. Furthermore, we only store a cube in the last time-frame
where it holds (to avoid duplication). We call this delta-encoded
trace F, and it relates to R through:
Rk = ⋀_{i≥k} ¬Fi
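Read operationally, the relation says that a state satisfies Rk exactly when it lies in none of the cubes stored in Fi for i ≥ k. A minimal C++ sketch of this reading, for k > 0 and with an explicit state given as a vector of bits, is shown below; the names State, inCube and satisfiesR, and the integer literal encoding, are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

using Lit = int;                   // +v: variable v is 1, -v: variable v is 0
using Cube = std::vector<Lit>;
using State = std::vector<bool>;   // st[v] is the value of variable v (index 0 unused)

// True if the state lies inside the cube, i.e. agrees with every literal.
bool inCube(const State& st, const Cube& c) {
    for (Lit l : c) {
        assert(static_cast<std::size_t>(std::abs(l)) < st.size());
        if (st[std::abs(l)] != (l > 0)) return false;
    }
    return true;
}

// R_k is the conjunction of the negations of all cubes stored in F[i], i >= k.
bool satisfiesR(const std::vector<std::vector<Cube>>& F, std::size_t k, const State& st) {
    for (std::size_t i = k; i < F.size(); ++i)
        for (const Cube& c : F[i])
            if (inCube(st, c)) return false;   // st is blocked by a cube of frame i
    return true;
}
```

Because the cubes of later frames are included in every earlier Rk, the containment Rk → Rk+1 holds by construction in this encoding.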
We also extend F by a special element F∞ which will hold
cubes that have been proved unreachable from the initial states
by any number of transitions. In the code, the following datatypes
are used:
– Vec<T>. A dynamic vector with methods:
uint size() – returns size of the vector
T& op[](uint i) – returns the ith element
void push(T elem) – pushes an element at the end
T pop() – pops and returns last element
– Cube. A ﬁxed-size vector of literals (no push/pop).
– TCube. A pair (cube ∈ Cube, frame ∈ uint) referred to
as a timed cube. Two special constants are deﬁned for
the frame component:
FRAME_NULL – cube has no time component
FRAME_INF – cube belongs in F∞
Function next(TCube s) returns s with the frame number
incremented by one.
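A timed cube might be declared as follows in C++. The concrete sentinel values chosen for FRAME_NULL and FRAME_INF are assumptions of this sketch (only the names of the constants are given above), as is the use of std::vector<int> for the cube.

```cpp
#include <cassert>
#include <limits>
#include <vector>

using Cube = std::vector<int>;

// Sentinel values are assumptions for this sketch. Making FRAME_INF larger
// than any real frame number lets it take part in ordinary comparisons.
constexpr unsigned FRAME_INF  = std::numeric_limits<unsigned>::max();
constexpr unsigned FRAME_NULL = FRAME_INF - 1;

struct TCube {
    Cube cube;
    unsigned frame;
};

// Same cube, one frame later. Only meaningful for ordinary frame numbers.
TCube next(const TCube& s) {
    assert(s.frame != FRAME_NULL && s.frame != FRAME_INF);
    return TCube{s.cube, s.frame + 1};
}
```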
An overview of the functions implementing PDR, and the
program state they work on is given in Figure 3. The FSM
is assumed to be given in circuit form, containing one safety
property to be proved. The special frame F∞ is stored as the
last element of the vector F.
An outline of the execution: Function pdrMain() gets a
bad state in the last frame and calls recBlockCube() to block
it, using the helper function isBlocked() (which checks if a
proof-obligation has already been solved) and generalize()
(which shortens a cube). When the property has been proved
for the last frame, propagateBlockedCubes() pushes cubes of
all time-frames forward while doing subsumption, handled by
addBlockedCube().
A. Separation of concerns
Our PDR implementation abstracts the handling of SAT calls
through the interface in Figure 4.
Program State:
Netlist N; – Netlist with property
Vec<Vec<Cube>> F; – Blocked cubes of each frame
PdrSat Z; – Supporting SAT solver(s)
Main Function:
bool pdrMain();
Recursive Cube Generation:
bool recBlockCube(TCube s0);
bool isBlocked(TCube s);
TCube generalize(TCube s0);
Cube Forward Propagation:
bool propagateBlockedCubes();
Small Helpers:
uint depth();
void newFrame();
bool condAssign(TCube& s, TCube t);
void addBlockedCube(TCube s);
Figure 3. Overview of PDR algorithm. “pdrMain()” will use
“recBlockCube()” to recursively block bad states of the final time
frame until the property holds, then call “propagateBlockedCubes()”
to push blocked cubes from all frames in the trace forward to the
latest frame where they hold.
interface PdrSat {
Cube getBadCube();
bool isBlocked(TCube s);
bool isInitial(Cube c);
TCube solveRelative(TCube s, uint params = 0);
void blockCubeInSolver(TCube s);
};
Figure 4. Abstract interface for the SAT queries of PDR. These
methods can be implemented using either a monolithic SAT-solver,
or one SAT-solver per time-frame. The roles of “Init” and “Bad” can
be exchanged within this SAT abstraction to obtain the dual PDR
procedure based on backward induction (although ternary simulation
cannot be used backwards). The first four functions correspond to
actual SAT queries (although for some common restrictions on initial
states, “isInitial()” can be implemented by a syntactic analysis).
The fifth function, “blockCubeInSolver()”, merely informs the SAT
implementation that a new cube has been added to the vector “F”.
The semantics of the interface is defined as follows:
Method getBadCube() returns a bad cube not yet blocked
in the last frame. Method isBlocked(s) returns TRUE if the
cube s.cube is blocked at s.frame. Method isInitial(c) returns
TRUE if the cube c intersects with the initial states. Method
blockCubeInSolver(s) reports to PdrSat that a cube has been
added to the vector F.
Finally, method solveRelative(s) tests if s.cube can be
blocked at frame s.frame using the extended query (Q2) of
Section III-B1. If the answer is UNSAT, then the implementation
returns a new cube z where:
z.cube ⊆ s.cube
z.frame ≥ s.frame
The method guarantees that not only is s.cube blocked at frame
s.frame, but that actually the subset z.cube is blocked at the
possibly later frame z.frame. The SAT solver may learn these more
general facts by inspecting the final conflict-clause of the solver
(or the UNSAT core), and taking this “free” information into account.
bool pdrMain() {
    F.push(); – push “F∞”
    newFrame(); – create “F[0]”
    Z = createPdrSat(N, F);
    forever{
        Cube c = Z.getBadCube();
        if (c != CUBE_NULL){
            if (!recBlockCube(TCube(c, depth())))
                – failed to block ’c’ ⇒ CEX found
                return FALSE;
        }else{
            newFrame();
            if (propagateBlockedCubes())
                – invariant found, may store it here
                return TRUE;
        }
    }
}
Figure 5. Main procedure. The last element of F (referred to as
“F∞”) contains all the cubes that have been proved to be unreachable
for all k. Their negation constitutes a proper inductive invariant.
Function “newFrame()” inserts a new frame into F just before F∞.
If instead the query is satisﬁable, then the implementation
returns a generalization, using ternary simulation, of a minterm
in the pre-image of s.cube. All states of the returned cube
z.cube can reach s.cube in one transition. The time component
z.frame is set to FRAME_NULL.
The behavior of solveRelative() can be altered by the
params argument. Default value “0” means: do not extract
a model if the query is satisfiable, just return (CUBE_NULL,
FRAME_NULL). Parameter “EXTRACTMODEL” means:
work as described above. Parameter “NOIND” means: use the
original query (Q1) instead of (Q2).
V. SAT SOLVING
In this section, we discuss the details of implementing
solveRelative() of the PdrSat interface using MINISAT and a
single SAT instance. The other methods of the PdrSat interface
can be implemented in a similar way.
There are two features that are particularly important: (i)
MINISAT allows incremental SAT through assumption literals;
a set of unit clauses that are temporarily assumed during one
SAT call. After the call, the assumptions are undone and new
regular clauses can be added before the next call. (ii) For
UNSAT calls, MINISAT returns the subset of assumptions that
were used in the proof.
The netlist is transformed to CNF using the standard Tseitin
transformation [9] plus variable elimination [6]. Logic cones
are added to the solver on demand, starting with just the
transitive fanin of Bad. Whenever a new frame is added
to the trace, a new activation literal act_i is reserved. All
clauses learned in that frame will be extended by ¬act_i in
blockCubeInSolver().
Given a cube s = (s1 ∧ s2 ∧ . . . ∧ sn), procedure solveRelative()
does the following:
bool recBlockCube(TCube s0) {
PrioQ<TCube> Q; – orders cubes from low to high frames
Q.add(s0);
while (Q.size() > 0){
TCube s = Q.popMin();
if (s.frame == 0)
– Found counterexample, may extract it here
return FALSE;
if (!isBlocked(s)){
assert(!Z.isInitial(s.cube));
TCube z = Z.solveRelative(s, EXTRACTMODEL);
if (z.frame != FRAME_NULL){
– Cube ’s’ was blocked by image of predecessor:
z = generalize(z);
while (z.frame < depth()−1
&& condAssign(z, Z.solveRelative(next(z))));
addBlockedCube(z);
if (s.frame < depth() && z.frame != FRAME_INF)
Q.add(next(s));
}else{
– Cube ’s’ was not blocked by image of predecessor:
z.frame = s.frame − 1;
Q.add(z);
Q.add(s);
}
}
}
return TRUE;
}
Figure 6. Recursively block a cube. The priority queue “Q” stores all
pending proof-obligations: a cube and a time frame where it should
be blocked. In a practical implementation, it may also store the proof-obligation
from which the element was generated (this facilitates
extraction of counterexamples). We noticed (or think we noticed)
a small performance gain by giving “PrioQ” a stack-like behavior
for proof-obligations of the same frame. We left one of our program
assertions in the pseudo code because this invariant is important and
non-obvious. Finally, note the line “Q.add(next(s))” (just above
the “else”). Adding the current proof-obligation in the next frame
is not necessary, but it improves performance for UNSAT problems
and allows PDR to ﬁnd counterexamples longer than the length of
the trace—sometimes much longer.
(1) Reserve a new activation literal a and add the clause
{¬a, ¬s1, ¬s2, ..., ¬sn} (unless NOIND is given).
(2) Call the solve method with the following assumptions:
[a, act_k, act_{k+1}, ..., act_{N+1}, s′_1, s′_2, ..., s′_n], where
s′_i denotes a flop input.
(2u) If UNSAT:
– Remove all literals of s whose corresponding assumption
s′_i was not used, unless doing so makes the new
cube overlap with the initial states.
– Find the lowest act_i that was used. Return the timed
cube (s_new, i + 1).
(2s) If SAT and EXTRACTMODEL is specified:
– Extract a minterm m from the satisfying assignment.
– Shorten m to a cube by ternary simulation. Return
(m_new, FRAME_NULL).
bool isBlocked(TCube s) {
– Check syntactic subsumption (faster than SAT):
for (uint d = s.frame; d < F.size(); d++)
for (uint i = 0; i < F[d].size(); i++)
if (subsumes(F[d][i], s.cube))
return TRUE;
– Semantic subsumption thru SAT:
return Z.isBlocked(s);
}
TCube generalize(TCube s) {
for all literals p ∈ s {
TCube t = “s minus literal p”
if (!Z.isInitial(t.cube))
condAssign(s, Z.solveRelative(t));
}
return s;
}
Figure 7. Helper functions for recursive cube blocking. Function
“isBlocked()” semantically checks if s is already blocked, which
could have happened after the proof-obligation was enqueued. For
efﬁciency reasons, it ﬁrst does a syntactic check. This check is so
effective that we did not notice any performance loss by disabling the
semantic SAT check at the end (but we kept it to ensure convergence,
as argued in Section III-D). In fact, deriving a new cube from s, even
if s is blocked, may be a good idea, as the new cube can subsume
several old cubes. We note that function “generalize()” iterates over
s while s is being modiﬁed, which the implementation must handle.
(2s’) Otherwise, if SAT, return (CUBE_NULL, FRAME_NULL).
(3) Add unit clause {¬a} permanently.
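Step (2u) can be illustrated with a small C++ sketch. It is not the implementation described in this paper: it assumes literals are encoded as signed integers, that the initial states form a single cube `init`, and that the solver reports the assumptions it used as a set `used` (each assumption s'_i is identified with its literal s_i). The names `overlapsInit` and `shrinkByCore` are hypothetical.

```cpp
#include <cassert>
#include <set>
#include <vector>

// A literal is a nonzero int: +v means flop v is 1, -v means flop v is 0.
using Cube = std::vector<int>;

// Assumption of this sketch: the initial states are one cube 'init'.
// A cube overlaps them unless it holds a literal whose negation is in 'init'.
bool overlapsInit(const Cube& c, const std::set<int>& init) {
    for (int lit : c)
        if (init.count(-lit)) return false;
    return true;
}

// Keep only the literals whose assumption was used in the UNSAT proof.
// If the result overlaps the initial states, restore one literal of the
// original cube that separates it from them.
Cube shrinkByCore(const Cube& s, const std::set<int>& used,
                  const std::set<int>& init) {
    assert(!overlapsInit(s, init));  // precondition: s excludes the initial states
    Cube out;
    for (int lit : s)
        if (used.count(lit)) out.push_back(lit);
    if (overlapsInit(out, init)) {
        for (int lit : s)
            if (init.count(-lit)) { out.push_back(lit); break; }
    }
    return out;
}
```

For example, with s = {1, -2, 3}, an all-zero initial cube {-1, -2, -3}, and only the assumption for -2 used, the shrunken cube {-2} would overlap the initial states, so the sketch restores literal 1 and returns {-2, 1}.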
The last step (3) forever disables the temporary clause added
in (1). The periodic cleanup of MINISAT will reclaim the
memory. However, the variable index reserved for the activation
literal cannot be reused. For that reason we recycle the
solver when more than 50% of the variables currently in use
are disabled activation literals. This has the added beneﬁt of
cleaning up cones of logic that may no longer be in use. We
note that the previous activation literal can be reused if s is a
subset of the cube of the previous call, which happens quite
frequently.
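The recycling rule above amounts to a simple counter check. The following C++ sketch shows one way to keep the bookkeeping; the struct and function names are hypothetical and no particular MINISAT interface is assumed.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical bookkeeping for one solver instance.
struct SolverStats {
    std::size_t varsInUse = 0;       // all variables currently allocated
    std::size_t retiredActLits = 0;  // activation literals disabled by step (3)
};

void onNewVar(SolverStats& st) { ++st.varsInUse; }
void onActivationRetired(SolverStats& st) { ++st.retiredActLits; }

// Recycle when more than 50% of the variables in use are disabled
// activation literals.
bool shouldRecycle(const SolverStats& st) {
    assert(st.retiredActLits <= st.varsInUse);  // retired literals are variables too
    return 2 * st.retiredActLits > st.varsInUse;
}
```

With 10 variables in use, 5 retired activation literals do not trigger recycling, while 6 do. Reusing the previous activation literal when s is a subset of the previous cube would simply skip both counters for that call.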
VI. EXPERIMENTAL ANALYSIS
A number of experiments have been performed to evaluate
our PDR implementation, both on public benchmarks from the
Hardware Model Checking Competition of 2010 (HWMCC10)
and on industrial benchmarks (footnote 5). This section summarizes the
most interesting results we have found.
A. Comparison of IC3 and PDR
This experiment was performed using 274 hard problems
from our industrial collaborators. We simpliﬁed the designs
by running the ABC command “dprove” (see Figure 4.1 of
[8]). With a timeout of 10 minutes, 42 problems were solved
by either IC3 or PDR; these are listed in Table I. From the table
we see that our implementation solves almost twice as many
(Footnote 5) Although we cannot distribute the industrial benchmarks, we will make
our implementation of PDR available at http://bvsrc.org
uint depth() { return F.size() - 2; }

void newFrame() {
    – Add frame to 'F' while moving 'F∞' forward:
    uint n = F.size();
    F.push();
    F[n-1].moveTo(F[n]);
}

bool condAssign(TCube& s, TCube t) {
    if (t.frame != FRAME_NULL) {
        s = t;
        return TRUE;
    } else
        return FALSE;
}

void addBlockedCube(TCube s) {
    uint k = min(s.frame, depth() + 1);
    – Remove subsumed clauses:
    for (uint d = 1; d <= k; d++) {
        for (uint i = 0; i < F[d].size();) {
            if (subsumes(s.cube, F[d][i])) {
                F[d][i] = F[d].last();
                F[d].pop();
            } else
                i++;
        }
    }
    – Store clause:
    F[k].push(s.cube);
    Z.blockCubeInSolver(s);
}
Figure 8. Small helper functions. Function “addBlockedCube()” will
add a cube both to F and the PdrSat object. It will also remove
any subsumed cube in F. Subsumed cubes in the SAT-solver will be
removed through periodic recycling.
instances as the original IC3 (38 vs. 21), but there are also 4
instances that IC3 solves and our PDR does not. The
last column shows, for comparison, the results of interpolation-based
model checking (IMC).
Figure 10 shows the behavior of the implementations for
increasing timeout limits. For space reasons we included two
more PDR runs discussed in the next section.
Figure 11 shows the performance of PDR, IC3 and IMC
on the HWMCC10 benchmarks. Looking closer at PDR vs.
IMC reveals that the difference is mostly on UNSAT problems,
where PDR solves 420 vs. 362 for IMC (14% difference). On
satisfiable instances, the numbers are 303 vs. 294 (3% difference).
B. Ternary simulation and Generalization
The third column of Table I, and the corresponding curve
in Figure 10, show the performance of PDR without ternary
simulation. It is clear that ternary simulation has a big impact.
Without it, our implementation drops well below IC3. One
reason for this may be that IC3 never had ternary simulation,
and Bradley implemented some other tricks that compensate
bool propagateBlockedCubes() {
    for (uint k = 1; k < depth(); k++) {
        for all cubes c ∈ F[k] {
            TCube s = Z.solveRelative(TCube(c, k+1), NOIND);
            if (s.frame != FRAME_NULL)
                addBlockedCube(s);
        }
        if (F[k].size() == 0) return TRUE; – Invariant found
    }
    return FALSE;
}
Figure 9. Propagating blocked cubes forward. All cubes in F are
revisited to see if they now hold at a later time-frame. If so, they are
inserted into that frame. The subsumption of “addBlockedCube()”
will remove the cube from its current frame (and possibly other
cubes in the later frame). Note that in a practical implementation, the
iteration over cubes in F[k] must be aware of these updates. Because
c is already present in frame k, we can use (Q1) instead of (Q2) in
the call to solveRelative().
[Figure 10 plot: solved instances (0 to 40) versus timeout in seconds (0 to 600), with curves for PDR, PDR with stronger generalization, IC3, and PDR without ternary simulation.]
Figure 10. Comparison of IC3 and PDR on industrial problems. Two
modifications to PDR are also evaluated.
for this loss, notably removing multiple literals per SAT call
in the cube generalization.
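To make the role of ternary simulation concrete, the following C++ sketch shrinks a minterm over a tiny AND/inverter network. It is an illustration under stated assumptions, not the implementation evaluated here: the circuit encoding (`Sig`, `Gate`, `Circuit`), the function names, and the choice of `targets` (standing in for the signals whose values must stay determined, such as the flop inputs named in the proof-obligation) are all hypothetical, and the circuit is re-simulated from scratch for each flop, where a practical implementation would presumably propagate changes incrementally.

```cpp
#include <cassert>
#include <vector>

enum Tern { ZERO, ONE, UNK };  // UNK is the unknown value X

Tern tnot(Tern a) { return a == UNK ? UNK : (a == ONE ? ZERO : ONE); }

Tern tand(Tern a, Tern b) {
    if (a == ZERO || b == ZERO) return ZERO;  // a controlling 0 hides any X
    if (a == ONE && b == ONE) return ONE;
    return UNK;
}

struct Sig { int node; bool neg; };  // node index plus optional inversion
struct Gate { Sig a, b; };           // two-input AND gate

// Nodes 0..numLeaves-1 are leaves (primary inputs and flop outputs);
// gate g is node numLeaves+g. Gates are listed in topological order.
struct Circuit {
    int numLeaves;
    std::vector<Gate> gates;
};

Tern value(const std::vector<Tern>& val, Sig s) {
    return s.neg ? tnot(val[s.node]) : val[s.node];
}

std::vector<Tern> simulate(const Circuit& c, const std::vector<Tern>& leaves) {
    assert(static_cast<int>(leaves.size()) == c.numLeaves);
    std::vector<Tern> val(leaves);
    for (const Gate& g : c.gates)
        val.push_back(tand(value(val, g.a), value(val, g.b)));
    return val;
}

// Try to set each flop to X; keep the change only if every target signal
// still has a definite (non-X) value.
std::vector<Tern> shrinkMinterm(const Circuit& c, std::vector<Tern> leaves,
                                const std::vector<int>& flops,
                                const std::vector<Sig>& targets) {
    for (int f : flops) {
        Tern saved = leaves[f];
        leaves[f] = UNK;
        std::vector<Tern> val = simulate(c, leaves);
        for (const Sig& t : targets)
            if (value(val, t) == UNK) { leaves[f] = saved; break; }
    }
    return leaves;
}
```

For a single gate x0 ∧ ¬x1 and the minterm x0 = 0, x1 = 0, x2 = 1, only x0 = 0 is needed to keep the gate output at 0, so the sketch reduces the minterm to the one-literal cube ¬x0.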
The fourth column shows the effect of stronger cube generalization,
as proposed in the paper on IC3. The modiﬁed
procedure will try to remove a literal even if the SAT query is
satisﬁable by exploiting the non-monotonicity. As in IC3, this
is done for three random literals. Our conclusion from looking
at the results is that this technique was not helpful. Although
we do not have room to present the data here, the result holds
for the HWMCC10 benchmarks as well.
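For reference, the baseline literal-dropping loop of Figure 7 (without the stronger variant discussed above) can be written as runnable C++ by abstracting the SAT queries into callbacks. This is a sketch under assumptions: frames are omitted, literals are signed integers, and `isInitial` and `solveRelative` are hypothetical stand-ins for Z.isInitial() and Z.solveRelative(); the latter returns true and writes a (possibly smaller) blocked cube on success, mirroring condAssign().

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

using Cube = std::vector<int>;  // literals: +v / -v

Cube without(const Cube& c, int lit) {
    Cube r;
    for (int l : c)
        if (l != lit) r.push_back(l);
    return r;
}

Cube generalize(Cube s,
                const std::function<bool(const Cube&)>& isInitial,
                const std::function<bool(const Cube&, Cube&)>& solveRelative) {
    assert(!isInitial(s));    // the cube to generalize excludes the initial states
    const Cube original = s;  // iterate over a snapshot, since s changes
    for (int p : original) {
        if (std::find(s.begin(), s.end(), p) == s.end())
            continue;  // p was already removed by an earlier query
        Cube t = without(s, p);
        if (isInitial(t)) continue;  // must not overlap the initial states
        Cube r;
        if (solveRelative(t, r)) s = r;  // accept the smaller blocked cube
    }
    return s;
}
```

Iterating over a snapshot of the original literals is one way to handle the fact, noted in the caption of Figure 7, that s is modified while it is being traversed.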
C. Effect of changing the semantics of R_N
As pointed out in Section III-C, we diverge from IC3 by not
requiring the last frame of the trace to fulﬁll the property. The
approach of IC3 has two effects compared to ours:
(1) When a new frame is opened, the property is known
to hold, so P can be added to the relative induction SAT
query. This means that the final invariant will be of the
form R_i ∧ P rather than just R_i, and that the clauses in
R∞ may depend on P.
Benchmark              IC3     PDR   NoSim   StGen     IMC
design01 prop1           –       –       –       –   249.5
design01 prop2         4.1     0.3   102.5     0.4     0.2
design01 prop3           –    81.2       –   126.9       –
design01 prop4           –    70.0       –   191.0       –
design01 prop5           –    91.6       –   166.5       –
design01 prop6           –   100.7       –   176.7       –
design01 prop7           –       –       –       –   168.8
design01 prop8       160.1     6.1       –    11.1    21.9
design01 prop9       130.1     5.9       –    10.7    42.8
design01 prop10       71.9     7.1       –    12.3    44.2
design02 prop1       594.0    30.2       –   144.0       –
design02 prop2           –   489.2       –       –       –
design02 prop3           –    68.0       –       –       –
design03 prop1           –   466.4       –   129.8       –
design03 prop2           –   483.3       –   130.8       –
design04              84.5       –       –       –       –
design05 prop1           –   172.5       –   152.5       –
design05 prop2           –   182.1       –   172.0       –
design06 prop1         2.7     0.8     1.8     1.0       –
design06 prop2         3.1     3.1     5.6     0.8       –
design07              94.4     6.0    88.6    13.8       –
design08             298.3    83.6       –   133.1       –
design09                 –    77.8       –   151.2       –
design10 prop1         2.0     1.0     2.3     1.4       –
design10 prop2         2.6     1.0     2.7     2.7       –
design11 prop1       324.4    28.1   474.0    27.4       –
design11 prop2         7.7     2.1     8.9     3.3       –
design12                 –       –       –       –    62.6
design13 prop1           –   126.1       –    85.0       –
design13 prop2           –    47.2       –    57.2       –
design13 prop3           –    26.0       –    22.2       –
design13 prop4           –    17.6       –    22.1       –
design13 prop5           –    18.1       –    26.4       –
design14 prop1        41.7       –       –       –       –
design14 prop2        61.5       –       –       –       –
design15 prop1           –     5.3       –    20.7     4.7
design15 prop2           –    32.8       –    10.9   595.8
design16               2.2     0.9     2.6     2.2   286.6
design17                 –       –       –   185.8       –
design18              10.8     0.7     4.9     1.4   409.7
design19             501.7    13.4       –    23.3       –
design20              17.1    10.0   138.0    20.4       –
design21             169.9       –       –   225.4       –
design22                 –   154.1       –   185.0       –
design23 prop1           –   438.7       –       –       –
design23 prop2           –   320.5       –   133.0       –
Total solved            21      38      11      37      11
Table I. Comparison of IC3 and PDR on industrial problems. Runtimes
are in seconds; “–” marks a timeout. Two
modifications to PDR are also evaluated (disabling ternary simulation
“NoSim”, and stronger cube generalization “StGen”). Interpolation
(IMC) is also included for comparison. All benchmarks are UNSAT
except for design02 (3 properties) and design14 (2 properties).
Boldfaced figures indicate the winner between IC3 and PDR only.
(2) Seeding the recursive cube-blocking with minterms
of the pre-image of P rather than with minterms of P
corresponds to a one-step target-enlargement.
The second difference, target-enlargement, can be implemented
by preprocessing the design (unroll the property cone
for one frame and combine the new and the old property
outputs). Figure 12 shows the effect it has on the HWMCC10
benchmarks. Note that it improves the performance for simple
satisfiable problems solved in less than 100 seconds. The
difference is substantial enough to motivate the use of
target-enlargement.
[Figure 11 plot: solved instances (500 to 750) versus timeout in seconds (0 to 600), with curves for PDR, IC3, and interpolation.]
Figure 11. Comparison of IC3 and PDR on HWMCC10 problems.
[Figure 12 plot: solved instances (150 to 450) versus timeout in seconds (0 to 600), with curves for 1 step target enlargement (UNSAT), no target enlargement (UNSAT), 1 step target enlargement (SAT), and no target enlargement (SAT).]
Figure 12. Effect of target enlargement.
We also investigated whether the first difference above had any
effect by running our previous PDR implementation, which
had the same behavior as IC3 in this respect (but includes
ternary simulation and other improvements). For space reasons
we do not include the graph here, but the curves of the new
implementation (with target enlargement) and the previous
implementation match exactly on both SAT and UNSAT
problems. We also tried target enlargement of 2 steps, but
there was no additional beneﬁt.
We conclude that there is no performance loss due to our
modiﬁcation of the original algorithm. It makes the implementation
simpler, and it has the extra beneﬁt that R∞ is a proper
invariant, which can be used to strengthen other proof-engines
running in parallel, or be useful for synthesis.
D. Runtime breakdown
In order to identify directions for future improvements, we ran
an instrumented version of our PDR on a handful of examples.
Our findings suggest that about 20% of the runtime is spent
in propagateBlockedCubes() and 80% in recBlockCube(),
most of which is in generalize(), but a substantial portion also
in the first call to solveRelative(). Satisfiable calls to the SAT-solver
are about twice as common as unsatisfiable ones, and
5x more expensive.
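A quick calculation shows what these ratios imply for where SAT time goes. The helper below is only a back-of-the-envelope sketch: it reads “5x more expensive” as five times the cost per call, and the function name is hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Fraction of total SAT time spent in satisfiable calls, given how many
// satisfiable calls occur per unsatisfiable call (callsRatio) and how much
// more a satisfiable call costs than an unsatisfiable one (costRatio).
double satisfiableShare(double callsRatio, double costRatio) {
    assert(callsRatio >= 0.0 && costRatio >= 0.0);
    double satWork = callsRatio * costRatio;  // work per UNSAT call of cost 1
    return satWork / (satWork + 1.0);
}
```

With callsRatio = 2 and costRatio = 5 the share is 10/11, so under this reading roughly 91% of the SAT-solving time would be spent in satisfiable calls.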
E. Other things we tried
– We evaluated the effect of the extended query (Q2) vs.
the original (Q1) (Section III-B1). Although (Q2) gave a
clear performance boost, PDR works remarkably well even
[Figure 13 plot: solved instances (550 to 750) versus timeout in seconds (0 to 600), with curves for Semantic COI, With activity, With reverse activity, and No Propagation + Sem. COI.]
Figure 13. Refuting activity / Semantic cone-of-influence.
without it (it solved 704 instead of 723 problems; more than
interpolation, which solved 656 problems).
– We evaluated the proposed activity scheme of IC3, which
controls the order in which literals are tried for removal. We
ran it against itself with the activity reversed (“worst” order)
and could see no difference (Figure 13), nor any difference from
a static order (not in the graph).
– We implemented a technique we call semantic cone-of-influence.
At the end of each major round, all cubes in the
trace that are not needed to prove the property of the final frame
are removed. This analysis can be done through a series of
SAT calls of roughly the same cost as forward-propagation.
The method removes many cubes. However, running PDR
with this turned on did not give any noticeable speedup, but
it also did not degrade performance (thus the cost of doing
semantic COI was amortized by the improvement). But a really
interesting result is that running semantic COI, while turning
off forward-propagation, works almost as well as the standard
version of PDR (Figure 13). In contrast, turning off forward-propagation
without semantic COI is a disaster! This shows
that an important feature of forward-propagation is the
cleansing effect it has through the subsumption mechanism.
– Because most of the time is spent in satisfiable SAT
calls, and this is partly a result of MINISAT always returning
complete models, we made a modified version of MINISAT
that only does BCP in the cone-of-influence of the flops in
the query. With this version, a few more benchmarks (728
instead of 723) were solved. However, we think a justification-based
variable order should do even better. We are currently
working on a circuit-based SAT-solver with this feature.
We have also implemented a non-monolithic version of PDR
(one solver instance per time-frame) that helps localize
the SAT solving better, especially together with frequent solver
recycling. For large benchmarks, where the relevant logic is
small compared to the size of the design, this version does
very well. It is worth noting that most of the work in PDR
takes place in the last couple of time-frames where the COI
is the smallest. In a monolithic PDR, early time-frames may
pollute these calls.
– We made a version that finds an inductive subset of
R_N after propagating the cubes forward. This will find true
inductive invariants that can be put into F∞. Although the
cost of this procedure did not quite amortize over the gains,
having more clauses in F∞ can be useful if those facts are
exported to other engines.
– We made an extension that allows PDR to develop and use
an abstraction, where some flops are considered as primary
inputs. This is relatively straightforward to implement. The
only tricky part in using localization abstraction is when it is
combined with proof-based abstraction [5], which can shrink
the current abstraction in the middle of PDR’s operations. The
reason is that the assertion in Figure 6 will not hold if we
apply a smaller abstraction to the initial states. The way to
address this is to introduce a reset signal that gives the correct
value at the ﬂop outputs of frame 0, and then let all ﬂops be
uninitialized.
REFERENCES
[1] A. R. Bradley. SAT-based model checking without unrolling.
In Proc. VMCAI, 2011.
[2] A. R. Bradley and Z. Manna. Checking safety by inductive
generalization of counterexamples to induction. In Proc.
FMCAD, 2007.
[3] G. Cabodi, L. A. Garcia, M. Murciano, S. Nocco, and S. Quer.
Partitioning interpolant-based verification for effective unbounded
model checking. In IEEE TCAD, 2010.
[4] G. Cabodi, M. Murciano, S. Nocco, and S. Quer. Boosting interpolation
with dynamic localized abstraction and redundancy
removal. In ACM TODAES, 2008.
[5] N. Een, A. Mishchenko, and N. Amla. A Single-Instance
Incremental SAT Formulation of Proof- and Counterexample-Based
Abstraction. In Proc. FMCAD, 2010.
[6] N. Een and A. Biere. Effective Preprocessing in SAT
through Variable and Clause Elimination. In Proc. SAT, 2005.
[7] K. L. McMillan. Interpolation and SAT-based Model Checking.
In Proc. CAV, 2003.
[8] A. Mishchenko, M. L. Case, R. K. Brayton, and S. Jang. Scalable
and scalably-verifiable sequential synthesis. In Proc. ICCAD,
2008.
[9] G. Tseitin. On the complexity of derivation in propositional
calculus. Studies in Constr. Math. and Math. Logic, 1968.