Cyclic Proofs of Program Termination in Separation Logic James Brotherston Imperial College, London, UK J.Brotherston@imperial.ac.uk Richard Bornat Middlesex University, London, UK R.Bornat@mdx.ac.uk Cristiano Calcagno Imperial College, London, UK ccris@doc.ic.ac.uk Abstract We propose a novel approach to proving the termination of heapmanipulating programs, which combines separation logic with cyclic proof within a Hoare-style proof system. Judgements in this system express (guaranteed) termination of the program when started from a given line in the program and in a state satisfying a given precondition, which is expressed as a formula of separation logic. The proof rules of our system are of two types: logical rules that operate on preconditions; and symbolic execution rules that capture the effect of executing program commands. Our logical preconditions employ inductively defined predicates to describe heap properties, and proofs in our system are cyclic proofs: cyclic derivations in which some inductive predicate is unfolded infinitely often along every infinite path, thus allowing us to discard all infinite paths in the proof by an infinite descent argument. Moreover, the use of this soundness condition enables us to avoid the explicit construction and use of ranking functions for termination. We also give a completeness result for our system, which is relative in that it relies upon completeness of a proof system for logical implications in separation logic. We give examples illustrating our approach, including one example for which the corresponding ranking function is non-obvious: termination of the classical algorithm for in-place reversal of a (possibly cyclic) linked list. Categories and Subject Descriptors F.3.1 [Logics and Meanings of Programs]: Specifying and Verifying and Reasoning about Programs--Logics of programs General Terms Verification, theory, reliability Keywords separation logic, termination, cyclic proof, program verification, inductive definitions, Hoare logic 1. Introduction Termination is an essential requirement of many computer programs yet, as every undergraduate computing student learns, deciding whether a particular program terminates on a given input is, in general, impossible. Thus research into proving program termination must focus on the development of appropriate frameworks and Supported by EPSRC. The authors also gratefully acknowledge Josh Berdine, Peter O'Hearn, Alex Simpson and Hongseok Yang for enlightening discussions, and the anonymous referees for their helpful comments and suggestions. Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. POPL'08, January 7­12, 2008, San Francisco, California, USA. Copyright c 2008 ACM 978-1-59593-689-9/08/0001...$5.00 proof search heuristics for termination proofs for various classes of program. In this paper, we give a proof system, based upon separation logic and employing the method of cyclic proof, that is tailored towards proving termination of simple imperative programs. Many important computer programs in use today are written in low-level imperative languages, and the ability to establish termination of such programs is thus clearly desirable. However, imperative programs have often proven resistant to formal analysis, at least partially due to the difficulty of reasoning about their use of pointer arithmetic and similar operations that operate on data stored in shared structures, such as the heap. Recently, separation logic was developed to help overcome this impasse by providing logical tools to reason about shared, mutable resource (Reynolds 2002). As well as the usual additive connectives of first-order logic, separation logic employs multiplicative connectives to express properties involving heap resource. The multiplicative conjunction denotes a division of the heap into two parts in which each conjunct holds respectively, and the multiplicative implication -- expresses a property of resource addition: if an arbitrary heap satisfies the antecedent formula and is disjoint with the current heap, then the combined heap satisfies the consequent formula. Separation logic has successfully underpinned several program verification applications to date, including the Smallfoot automated tool (Berdine et al. 2006a), local shape analysis (Distefano et al. 2006; Calcagno et al. 2006), inductive recursion synthesis (Guo et al. 2007), total correctness proofs of non-trivial algorithms (Torp-Smith et al. 2004; Bornat et al. 2004) and, pertinently for our purposes, automated termination checking (Berdine et al. 2006b). Existing work on program verification in separation logic has relied on the identification and use of suitable inductive definitions to express shape properties of the heap (at some given point in the execution of the program). A recent paper by the first author (Brotherston 2007) developed a general framework for inductive definitions in O'Hearn and Pym's logic of bunched implications BI (O'Hearn and Pym 99), which underlies the pure-logical part of separation logic, and gave proof systems for the extension, including an appropriate notion of cyclic proof. For logics featuring inductive predicates or similar fixed-point constructions, cyclic proof provides an alternative to traditional inductive proof, modelled on Fermat's infinite descent (Brotherston 2006; Brotherston and Simpson 2007; Sprenger and Dam 2003). In the case of inductively defined relations, one way of stating this principle is: if in some case of a proof some inductive definition is unfolded infinitely often, then that case may be disregarded. Essentially, this principle is sound because each inductive definition has a least-fixed point interpretation which can be constructed as the union of a chain of approximations, indexed by ordinals; unfolding a definition infinitely often can thus be seen as inducing an infinite descending chain of these ordinals, which contradicts their well-foundedness. In cyclic proof systems, the capacity for unfolding a definition infinitely often is built in to the system by allowing proofs to be nonwell-founded, i.e. to contain infinite paths. Such proof structures 101 are, in general, unsound, so a global "well-formedness" condition is additionally imposed on proofs to ensure that every infinite path can be disregarded by the infinite descent principle outlined above. The portion of proof that is not disregarded by this principle is thus finite, and sound for standard reasons. Our main contribution in this paper is the formulation of a cyclic proof system tailored to proving termination of programs written in a simple, yet relatively expressive imperative programming language. Given some fixed program, the judgements of this system express guaranteed (non-faulting) termination of the program when started from a given line in the program and in a state satisfying a given precondition, which is expressed as a formula of separation logic. The proof rules of the system are of two types: logical rules that operate on the precondition, and symbolic execution rules that simulate program execution steps. Thus, a program execution path corresponds to a path in a derivation, interleaved with logical infer- ences. A program terminates just if no infinite computation is possible, i.e. if all potentially infinite computations can be dismissed as logically contradictory. An instance of this principle is the size change principle for program termination of Lee, Jones and BenAmran, which states that a program is terminating if every infinite computation induces an infinite descending sequence of values from a well-ordered set (Lee et al. 2001). Here, we employ the existing techniques of cyclic proof (Brotherston 2007, 2006, 2005; Sprenger and Dam 2003; Sch¨opp and Simpson 2002) to consider and then dismiss potentially infinite computations in the manner described above. A cyclic pre-proof in our system is formed from partial derivation trees by identifying every `bud' (a node to which no proof rule has yet been applied) with a syntactically identical interior node, so that pre-proofs can be immediately understood as cyclic graphs in which infinite program computations correspond to infinite proof paths. To ensure that all such computations can be disregarded, we demand that, to count as a bona fide cyclic proof, a pre-proof must satisfy a global trace condition which formalises the fact that some inductively defined predicate is unfolded infinitely often along every infinite path. This ensures the soundness of our cyclic proofs. Moreover, the use of this condition means that we do not need to construct explicit ranking functions for termination. Indeed, it appears possible to construct cyclic termination proofs for which it is difficult to construct or deploy a ranking function. The remainder of this paper is structured as follows. In Section 2 we give the syntax and small-step semantics of a simple imperative programming language, TOY-C, which is nonetheless sufficient to express many pointer-based algorithms. In Section 3, we give the syntax and semantics of our language for program preconditions, which is an extension of separation logic with a schema for inductive definitions as given in Brotherston (2007). In Section 4, we give the rules of our proof system for termination, and then proceed to formalise the notion of a cyclic proof in this system in Section 5, (along very similar lines to the notions of cyclic proof employed in Brotherston (2007, 2006)). We provide a soundness theorem and also a relative completeness theorem for our system, the latter result demonstrating that the question of provability of a valid judgement in our system can always be reduced to the question of provability of a valid logical judgement in pure separation logic. In Section 6, we give some sample termination proofs in our cyclic system, including one example -- the in-place reversal of a cyclic linked list -- for which the termination measure is far from obvious. Finally, in Section 7 we summarise and identify the directions for future work. 2. TOY-C: a small imperative programming language In this section we give the syntax and semantics of a toy programming language, TOY-C, that nevertheless contains the following essential features: conditional branching; assignment; dereferencing; and memory allocation/deallocation. This allows us to deal with manipulation of the heap, the traditional domain of separation logic, but the cyclic proof method applies to a far wider range of programs than we can consider here. We assume the existence of a denumerably infinite set Var of variables, and a first-order language exp, called the expression language, satisfying exp Var and containing a distinguished constant symbol nil. The expressions E of TOY-C are just the terms (defined as usual) of the expression language. The syntax of branching conditions Cond and atomic commands C is then given by the following grammar: Cond ::= E = E | E = E C ::= x := E | x := [E] | [E] := E | x := new() | free(E) | if Cond goto j | stop where x := [E] and [E] := F access and assign to the heap cell with address E. A program in TOY-C is a finite sequence of indexed commands 1 : C1; ; n : Cn (we say that Ci is the command at program point i). We write the command goto j as an abbreviation of if nil = nil goto j. We give the semantics of TOY-C programs using a basic RAM model. We fix a set Val of (program) values and a set Loc Val of (program) locations, and assume a distinguished "nullary" value nil Val-Loc. A stack in this model is a function s : Var Val, and a heap is a partial, finitely-defined function h : Loc fin Val; we write Stacks and Heaps for the set of all stacks and the set of all heaps respectively. We write s[x v] for the stack defined exactly as s except that (s[x v])(x) = v, and adopt a similar notation for heaps. We write to denote composition of heaps: if h1 and h2 are heaps with dom(h1) dom(h2) = , then h1 h2 is the composite heap defined by: (h1 h2)(l) = h1(l) if l dom(h1) h2(l) if l dom(h2) undefined otherwise The interpretation [[E]]s of an expression E in a stack s is standard: [[nil]]s = nil and [[x]]s = s(x) for any x Var (given some fixed interpretation for any function symbols in the expression language, s can then be extended to all expressions in the usual way). Similarly, for branching conditions, we have s [[E1 = E2]] iff [[E1]]s = [[E2]]s, and s [[E1 = E2]] iff [[E1]]s = [[E2]]s. A (program) state in our model is a triple (i, s, h), where i N represents the next line of the program to be executed (i.e. the value of the program counter), s is a stack and h is a heap. The small-step semantics of programs, presented in Figure 1, is given by a binary relation on program states, with the intended meaning that (i, s, h) (i , s , h ) holds if the execution of the next command in the state (i, s, h) can result in the new program state (i , s , h ). We write (i, s, h) iff there is no infinite -sequence beginning with (i, s, h) . . ., i.e. iff the program terminates when started in the state (i, s, h). When a command tries to access unallocated memory, the execution continues from the special location fault, and loops. This effectively equates memory errors and divergence. (Note also that the absence of a case for stop in the small-step semantics implies that our programs terminate immediately on encountering a stop command.) 102 Ci x := E (i, s, h) (i + 1, s[x [[E]]s], h) Ci x := [E] [[E]]s dom(h) (i, s, h) (i + 1, s[x h([[E]]s)], h) Ci [E] := E [[E]]s dom(h) (i, s, h) (i + 1, s, h[[[E]]s [[E ]]s]) Ci x := new() Loc \ dom(h) v Val (i, s, h) (i + 1, s[x ], h[ v]) Ci free(E) [[E]]s dom(h) (i, s, h) (i + 1, s, (h (dom(h) \ {[[E]]s})) Ci if Cond goto j s [[Cond]] (i, s, h) (j, s, h) Ci if Cond goto j s [[Cond]] (i, s, h) (i + 1, s, h) (fault, s, h) (fault, s, h) Ci x := [E] | [E] := E | free(E) [[E]]s / dom(h) (i, s, h) (fault, s, h) Figure 1. Small-step operational semantics of TOY-C, given by the binary relation over (N × Stacks × Heaps). Example 2.1 (List traversal program). Consider the C-like program for traversing a linked list:1 while (x!=nil) x:=[x]; The equivalent TOY-C program can be written as follows: 1: if x = nil goto 4; 2: x := [x]; 3: goto 1; 4: stop Clearly this program terminates if the heap consists of an acyclic, singly-linked list whose first node is pointed to by x and whose last node contains the null pointer nil (because the length of the list remaining to be traversed clearly decreases at each iteration of the loop, and is initially finite). We give a formal proof in Section 6. Our language has the termination monotonicity property (see Yang and O'Hearn (2002)). If a program terminates in a heap h then it terminates in every extension of h. Proposition 2.2 (Termination monotonicity). If (i, s, h) and hh is defined then (i, s, h h ). Proof. (Sketch) By contraposition, it suffices to show that the existence of an infinite -sequence starting from (i, s, h h ) implies the existence of an infinite -sequence starting from (i, s, h). This follows from the following claim: Claim. If (i1, s1, h1 h ) (i2, s2, h2) then either: 1. h2 = h h for some h , and (i1, s1, h1) (i2, s2, h ), or; 2. (i1, s1, h1) (fault, s1, h1). Then given any infinite -sequence starting from (i, s, h h ), it is either the case that no step in this sequence accesses h , in which case every step in this sequence can be equally well carried out starting from (i, s, h), or that some step in the sequence accesses h , in which case the equivalent step in the sequence starting from (i, s, h) causes a fault and thus this sequence immediately diverges. In either case we have the required infinite -sequence starting from (i, s, h). It only remains to substantiate the claim, which follows from a straightforward case analysis on (i1, s1, h1 h ) (i2, s2, h2). 3. Separation logic with inductive definitions In this section, we reprise the syntax and semantics of separation logic extended with a framework for inductive definitions, as given in Brotherston (2007). This gives us a framework for expressing heap properties using customised inductive definitions, which will 1 For simplicity, in this and other examples we treat lists which contain nothing but pointers to successor cells. be useful later in the expression of suitable preconditions for our TOY-C programs. We assume a fixed first-order language , and for reasons of expressivity stipulate that exp, i.e., that our logical language extends the TOY-C expression language. The terms of are defined as usual, with variables drawn from the set Var. We write t(x1, . . . , xk) for a term t all of whose variables occur in {x1, . . . , xk}, and we often use vector notation to abbreviate sequences, e.g. x for (x1, . . . , xk). The interpretation [[t]]s of a term t of in a stack s is then defined as for expressions (provided we have given an interpretation for any constant or function symbol that is not in exp 2 ). In contrast to the usual situation in first-order logic, but in keeping with prior developments for first-order logic with inductive definitions (Brotherston 2006), we designate finitely many of the predicate symbols of as special inductive symbols; a predicate symbol that is not inductive is called ordinary. For each predicate symbol Q of arity k (say) we assign an intepretation [[Q]] Pow(Heaps × Valk ). We insist that contains a 0-ary ordinary predicate symbol emp and a binary ordinary predicate symbol , whose interpretations are given respectively by: [[emp]] = {h | dom(h) = } [[]] = {(h, v1, v2) | dom(h) = {v1} and h(v1) = v2} Thus the predicate emp denotes the empty heap, while v1 v2 denotes those heaps having exactly one location, which is denoted by v1 and whose contents are denoted by v2. Our formulas are the usual formulas of predicate BI3 , given by the following grammar: F ::= | | emp | Q(t1, . . . , tk) (k = arity of Q) | t1 = t2 | F F | F F | F F | F F | F -- F | xF | xF where t1, . . . , tk range over the terms of and Q ranges over the predicate symbols (both ordinary and inductive) of . We use the standard precedences on the logical connectives, with and -- having the same logical precedence as and respectively, and use parentheses to disambiguate where necessary. Also, we write F as an abbreviation of the formula F . We may then define the standard satisfaction relation s, h |= F, by induction on the structure of F, as shown in Figure 2. However, we shall further insist that the interpretation [[P]] of each inductive predicate symbol P coincides with the standard interpretation, which is fixed by a given inductive definition for P. Our inductive definition schema (essentially the one formulated in Brotherston (2007)) is given by the following definition: 2 Our extension of stacks to arbitrary logical terms is a technical simplification. We could instead use stack-extending environments to interpret terms. 3 However, note that here we write emp for the multiplicative unit I of BI. 103 s, h |= true s, h |= false s, h |= t1 = t2 [[t1]]s = [[t2]]s s, h |= Q(t1, . . . , tk) (h, [[t1]]s, . . . , [[tk]]s) [[Q]] (Q ordinary or inductive) s, h |= F1 F2 s, h |= F1 or s, h |= F2 s, h |= F1 F2 s, h |= F1 and s, h |= F2 s, h |= F1 F2 s, h |= F1 implies s, h |= F2 s, h |= F1 F2 h1, h2. h = h1 h2 and s, h1 |= F1 and s, h2 |= F2 s, h |= F1 -- F2 h . h h defined and s, h |= F1 implies s, h h |= F2 s, h |= xF v Val. s[x v], h |= F s, h |= xF v Val. s[x v], h |= F Figure 2. The relation s, h |= F for satisfaction of a separation logic formula F by a stack s and heap h. Definition 3.1 (Inductive definition). An inductive definition of an inductive predicate symbol P is a set of statements: C1(x1) P t1(x1), . . . , Ck(xk) P tk(xk) where k N and C1(x1), . . . , Ck(xk) are inductive clauses, given by the following grammar: C(x) ::= Pt(x) | ^F(x) | C(x) C(x) | C(x) C(x) | ^F(x) C(x) | ^F(x) -- C(x) | xC(x) where P ranges over the inductive predicate symbols of and ^F(x) ranges over all formulas in which no inductive predicates occur and whose free variables are contained in {x}. Each statement Ci(xi) Piti(xi) is read as a disjunctive clause of the definition of the inductive predicate symbol Pi. As in Brotherston (2007), our use of ^F(x) on the left of implications in inductive clauses is designed to ensure monotonicity of our inductive definitions. A more liberal definition scheme might allow inductively defined predicates to occur negatively subject to an appropriate stratification of inductive definitions, as in iterated inductive definitions (Martin-L¨of 1971). The standard interpretation of an inductive predicate symbol P is then, as usual, the least prefixed point of a monotone operator constructed from the inductive definitions. This operator, and the process for constructing its least fixed point, are essential to understanding the soundness of our cyclic proof system, so we include the details here: Definition 3.2 (Definition set operator). Let the inductive predicate symbols of be P1, . . . , Pn with arities a1, . . . , an respectively, and suppose we have a unique inductive definition for each predicate symbol Pi. Then, for each i {1, . . . , n}, from the inductive definition for Pi, say: C1(x1) Pi t1(x1), . . . , Ck(xk) Pi tk(xk) we obtain a corresponding n-ary function i : (Pow(Heaps × Vala1 ) × . . . × Pow(Heaps × Valan )) Pow(Heaps × Valai ) as follows: i(X) = 1jk{ (h, [[tj]](s[xj d])) | s[xj d], h |=[[P]]X Cj(xj)} where s is an arbitrary stack and |=[[P]]X is the satisfaction relation defined exactly as in Figure 2 except that [[Pi]] = n i (X) for each i {1, . . . , n}. Note that any variables occurring in the right hand side but not the left hand side of the set comprehension in the definition of i above are, implicitly, existentially quantified over the entire right hand side of the comprehension. Then the definition set operator for P1, . . . , Pn is the operator P, with domain and codomain Pow(Heaps×Vala1 )×. . .×Pow(Heaps×Valan ), defined by: P(X) = (1(X), . . . , n(X)) Example 3.3. Consider the following inductive definition for a binary inductive predicate symbol ls: emp ls x x x x ls x y ls x y Then [[ls]] is the least prefixed point of the following operator, with domain and codomain Pow(Heaps × Val2 ): ls(X) = {(emp, (v, v)) | v Val} {(h1 h2, (v, v )) | w Val. (h1, (v, w)) [[]] and (h2, (w, v )) X} The predicate ls denotes singly-linked list segments; ls x y is true of those heaps that represent a (possibly cyclic) linked list segment whose first element is pointed to by x and whose last element contains the pointer y. (If the list is acyclic, then y is a dangling pointer.) The operator generated from a set of inductive definitions by Definition 3.2 is monotone (Brotherston 2007), and consequently has a least (pre)fixed point, which gives the standard interpretation for the inductively defined predicates of the language. As is wellknown (see e.g. Aczel (1977)), this least prefixed point can be iteratively approached in ordinal-indexed stages or approximants: Definition 3.4 (Approximants). Let P be the definition set operator for the inductive predicates P1, . . . , Pn of as in Definition 3.2. Define a chain of ordinal-indexed sets ( P)0 by transfinite induction: P = < P( P) (note that this implies 0 P = (, . . . , )). Then for each i {1, . . . , n}, the set P i = n i ( P) is called the th approximant of Pi. Definition 3.5 (Standard interpretation). The function [[-]] is said to respect the standard interpretation of the inductive predicates P1, . . . , Pn of if for all i {1, . . . , n}, we have [[Pi]] = P i . From now on, we assume that [[-]] always respects the standard interpretation of any inductive predicate symbols. 4. Proof rules for termination judgements In this section we give the rules of a Hoare-style proof system for proving termination of programs written in TOY-C. We write termination judgements of the form i, where i is a program point and is a bunch (O'Hearn and Pym 99): Definition 4.1 (Bunch). A bunch is a tree whose leaves are formulas (as defined in Section 3) and whose internal nodes are `;' or `,' (which are logically equivalent to and , respectively). We write () to mean that is a bunch of which is a subtree (also called a "sub-bunch"), and write ( ) for the bunch obtained by replacing the considered instance of by in (). We observe that a bunch can be considered as a formula by replacing every occurrence of `;' by `' and every occurrence of `,' by `', and thus we extend our notion of satisfaction (cf. Fig. 2) to bunches in the obvious manner. Definition 4.2 (Coherent equivalence). Coherent equivalence, , is the smallest binary relation on bunches satisfying the following (commutative monoid) equations: 1; (2; 3) (1; 2); 3 1, (2, 3) (1, 2), 3 1; 2 2; 1 1, 2 2, 1 ; emp, 104 plus the rule of congruence: implies () ( ). Definition 4.3 (Validity). A termination judgement i is valid iff s, h |= implies (i, s, h) for all s Stacks and h Heaps. Our Hoare logic rules for termination judgements are given in Figure 3. The symbolic execution rules for commands are adaptations of standard rules for separation logic (Berdine et al. 2005). The convention is that primed variables x and x in the preconditions must be chosen fresh. The general rules are similar to rules in a proof system for implication. In places, our rules differ slightly from the typical presentation of Hoare logic in order to simplify the tracking of occurrences of inductive predicates, which is crucial for cyclic proofs. For example, our use of the cut rule in place of the usual rule of consequence in Hoare logic allows us to track inductive predicates in the "context part" of the rule. Our cut rule is presented with just one premise, with the usual second premise F instead given as a side condition (which can be seen as an analogue of the usual side condition, involving semantic entailment, on the rule of consequence). This separates our Hoare reasoning, and tracking of inductive predicates, from an external reasoning process for establishing pure implication (which itself might use cyclic proofs (Brotherston 2007) or an existing theorem prover such as Berdine et al. (2006a)). The multiplicative weakening rule (WkM) does not normally hold in separation logic but is valid in the case of termination, justified by Proposition 2.2. The case-split rule for inductive predicates will play a core role in our cyclic proof system, and deserves more explanation. Suppose that the inductive predicate P has definition: C1(x1) P t1(x1), . . . , Ck(xk) P tk(xk) Our case-split rule: ((t = tj(x); Cj(x)) i)1jk (Case P) x {x}. x FV ((Pt))(Pt) i unfolds an occurrence Pt of an inductive predicate according to its definition (with the surrounding context remaining unaffected). Example 4.4 (List segment). Let ls be the inductive predicate for (possibly cyclic) list segments defined in Example 3.3. The inductive definition of ls determines the following case-split rule: (t = u; emp) i (tx ls x u) i (Case ls) (ls t u) i Proposition 4.5. The Hoare logic rules given in Figure 3 are locally sound. That is to say, if all of the premises of any rule instance are valid, and any relevant side conditions are satisfied, then the conclusion of the rule instance is also valid. 5. Cyclic proofs of program termination We now define a notion of cyclic proof for our termination judgements. First we define cyclic pre-proofs: finite derivation trees together with a function that, for each leaf node in the tree to which no proof rule has been applied (called a bud), assigns an interior node in the tree that is labelled with an identical judgement; this interior node is called the companion of the bud. A cyclic pre-proof can thus be seen as a representation of a regular, infinite derivation tree by a cyclic graph. Definition 5.1 (Companion). Let B be a bud of a derivation tree D. An internal node C in D is said to be a companion for B if they have the same judgement labelling. By assigning a companion to each bud node in a finite derivation tree, one obtains a finite representation of an associated (regular) infinite tree: Definition 5.2 (Cyclic pre-proof). A cyclic pre-proof of a termination judgement F i is a pair P = (D, R), where D is a derivation tree constructed according to the Hoare logic rules for termination judgements given in Section 4 and whose root is labelled by F i, and R is a function assigning a companion to every bud of D. If P = (D, R) is a cyclic pre-proof, we write GP for the graph obtained from D by identifying each bud node B in D with its companion R(B). Definition 5.3 (Trace). Let P be a pre-proof and let (k ik )k0 be a path in GP . A trace following (k ik )k0 is a sequence (k)k0 such that, for all k, k is the position of a leaf Fk of k. Furthermore, for each k 0, one of the following conditions must hold: 1. (Fk is in the active part of the rule) k ik is the conclusion of one of the following inferences, k is the position of the underlined formula in the conclusion and k+1 is the position of one of the underlined formulae in the premise: (F1; F2) i () (F1 F2) i (F2) i F1 (--) (, F1 -- F2) i (F1, F2) i () (F1 F2) i (; F2) i F1 () (; F1 F2) i (F[t/x]) i () (xF) i (t = tj(x); Cj(x)) i . . . (Case P) (Pt) i (We remark that, due to the form of our inductive definitions (cf. Defn 3.1), we never need to trace formulas through the active part of the rules () or ().) In the case where (Case P) is applied with conclusion k ik as above, and k and k+1 are the positions of the leaves of k and k+1 indicated by the underlining, the trace is said to progress at k. An infinitely progressing trace is a trace that progresses at infinitely many points. 2. (Fk is not in the active part of the rule) k+1 is the position of the leaf in k+1 corresponding to k in k, modulo any splitting of k performed by the rule. (Thus Fk+1 = Fk , modulo any substitution performed by the rule.) E.g. if k ik is the conclusion of the inference: (F2) i F1 (--) (, F1 -- F2) i then k+1 and k are the same position in (-) (and thus Fk+1 = Fk ). Similarly, if k ik is the conclusion of the inference: x = t[x /x]; (E t, )[x /x] i+1 Ci x := [E] E t, i then k+1 and k are the same position in (-) (and thus Fk+1 = Fk [x /x]). As a final example, if k ik is the conclusion of the inference: i (Equiv) i then the position of k+1 in the rearranged bunch must respect the original position of k in in the obvious way (and thus Fk+1 = Fk ). 105 Symbolic execution rules: x = E[x /x]; [x /x] i+1 Ci x := E i Cond; j Cond; i+1 Ci if Cond goto j i x = t[x /x]; (E t, )[x /x] i+1 Ci x := [E] E t, i x x , [x /x] i+1 Ci x := new() i E0 E1, i+1 Ci [E0] := E1 E0 t, i i+1 Ci free(E) E t, i Ci stop i General rules: () i (WkA) (; ) i () i (WkM) (, ) i (; ) i (Contr) () i i (Equiv) i (F) i F (Cut) () i i x not a program variable (Subst) [t/x] i ( )[t2/x, t1/y] i (=) (t1 = t2)[t1/x, t2/y] i () i (F1) i (F2) i () (F1 F2) i (F1; F2) i () (F1 F2) i (; F2) i F1 () (; F1 F2) i (F[t/x]) i () (xF) i (F[z/x]) i z FV ((xF)) () (xF) i (F1, F2) i () (F1 F2) i (F2) i F1 (--) (, F1 -- F2) i Case-split rule: ((t = tj(x); Cj(x)) i)1jk C1(x1) P t1(x1), . . . , Ck(xk) P tk(xk) x {x}. x FV ((Pt)) (Case P) (Pt) i Figure 3. Hoare logic rules for termination judgements. Definition 5.4 (Proof). A pre-proof P is a proof if for every infinite path in GP , there is an infinitely progressing trace following some tail of . Theorem 5.5. If there is a proof of i then i is valid. An outline proof of Theorem 5.5 is given in Appendix A. Proposition 5.6. It is decidable whether a pre-proof is a proof. As well as soundness, our proof system enjoys the following relative completeness property: Theorem 5.7 (Relative Completeness). Under the assumption that the underlying proof system for ordinary implications F is complete, if i is valid then there is a proof of i. Intuitively, it is possible to define inductive predicates termi which ensure termination starting from program point i, and the proof that ensures termination at i can be reduced to an ordinary implication termi x. This situation is analogous to what happens in ordinary Hoare logic, where relative completeness proofs are typically expressed with respect to an oracle for implications. We give an outline proof of Theorem 5.7 in Appendix B. 6. Examples of cyclic termination proofs We show three examples, two straightforward and one more intricate. In each example we have treated goto as if it had a singlepremise rule of its own. Example 6.1 (Termination of list traversal). Consider the program from Example 2.1 with precondition ls x nil, with predicate ls as in Example 3.3. Figure 4 gives a proof of termination. We show the association of a suitable companion to the only bud in the pre-proof with a (red) arrow. Note that there is only one infinite path in the pre-proof ­ which goes around the cycle ­ and the associated trace (underlined) makes progress infinitely often at the case-split rule. Therefore we have termination. This proof uses multiplicative weakening (WkM), throwing away part of the heap, in this case the cells of the list that have already been traversed. This is a peculiar thing to do in separation logic, in which proofs are careful to account for all resource. In the next two examples we show that proofs which more carefully account are possible, but for termination of list traversal we really only need to know that the list yet to be traversed diminishes, and the proof of figure 4 is all that is required. Example 6.2 (Termination of in-place list reversal). The classical in-place reverse algorithm reverses a list in place, using two variables and no additional heap space: y := nil; while x = nil do z := x; x := [x]; [z] := y; y := z od 106 stop x = nil; ls x nil 4 () ; x = nil; emp 2 () x = nil; x = nil; emp 2 ls x nil 1 goto 1 ls x nil 3 (WkM) x x, ls x nil 3 (WkA) x = nil; (x x, ls x nil) 3 (=) x = x ; x = nil; (x x , ls x nil) 3 x := [x] x = nil; (x x , ls x nil) 2 () x = nil; x x ls x nil 2 (Case ls) x = nil; ls x nil 2 if x = nil goto 4 ls x nil 1 stop x = nil; ls x nil 4 () ; x = nil; emp 2 () x = nil; x = nil; emp 2 ls x nil 1 goto 1 ls x nil 3 (WkM) x x, ls x nil 3 (WkA) x = nil; (x x, ls x nil) 3 (=) x = x ; x = nil; (x x , ls x nil) 3 x := [x] x = nil; (x x , ls x nil) 3 () x = nil; x x ls x nil 2 (Case ls) x = nil; ls x nil 2 if x = nil goto 4 ls x nil 1 Figure 4. A termination proof of list traversal stop x = nil (ls y nil ls x nil) 8 z = x x = nil (ls y nil (x = nil emp)) 4 ls y nil ls x nil 2 goto 2 ls y nil ls x nil 7 cut y = z (ls y nil z y ls x nil) 7 y := z ls y nil z y ls x nil 6 [z] := y ls y nil z x ls x nil 5 = I x = x z = x (ls y nil x x ls x nil) 5 x := [x] z = x (ls y nil x x ls x nil) 4 WkA z = x x = nil (ls y nil x x ls x nil) 4 (Case ls) z = x x = nil (ls y nil ls x nil) 4 z := x x = nil (ls y nil ls x nil) 3 if x = nil goto 8 ls y nil ls x nil 2 Figure 5. Termination of in-place list reversal This proof could have used multiplicative weakening in place of cut, but we have chosen not to do so in order to emphasise the connection with the next example. Example 6.3 (Termination of in-place "frying-pan list" reversal). The previous example algorithm will reverse a cyclic list segment, but the loop measure, and hence the proof of termination, is tricky. A cyclic list segment ls x j in which the terminating pointer j points to a node already in the segment can be seen as a separated three-part structure of two acyclic list segments and a "join node", represented in separation logic as: k.(ls x j j k ls k j) (2) in which each of the disjuncts corresponds directly to one of the pictures in figure 6. This invariant is sufficiently obvious that it can be discovered automatically [13], but it's hard to see what formula to use as a measure for this process, because x plays different r^oles at different stages of the proof. A termination proof could be hacked up using auxiliary variables to record the join point and the stage of the proof, but the proof stages and the join point are proof artefacts, and it should be unnecessary to reveal them. Figure 7 shows a pre-proof of the judgement I 2, where I is the invariant (3) (with existential quantifications omitted for simplicity, and double-line steps indicating implicit additive weakenFigure 4. A termination proof of list traversal Figure 4. A termination proof of list traversal stop x = nil (ls y nil ls x nil) 8 () z = x x = nil (ls y nil (x = nil emp)) 4 ls y nil ls x nil 2 goto 2 ls y nil ls x nil 7 (Cut) y = z (ls y nil z y ls x nil) 7 y := z ls y nil z y ls x nil 6 [z] := y ls y nil z x ls x nil 5 (=) x = x z = x (ls y nil x x ls x nil) 5 x := [x] z = x (ls y nil x x ls x nil) 4 (WkA) z = x x = nil (ls y nil x x ls x nil) 4 (Case ls) z = x x = nil (ls y nil ls x nil) 4 z := x x = nil (ls y nil ls x nil) 3 if x = nil goto 8 ls y nil ls x nil 2 Figure 5. Termination of in-place list reversal This proof could have used multiplicative weakening in place of cut, but we have chosen not to do so in order to emphasise the connection with the next example. Example 6.3 (Termination of in-place "frying-pan list" reversal). The previous example algorithm will reverse a cyclic list segment, but the loop measure, and hence the proof of termination, is tricky. The precondition is (??) and the invariant is: k1, k2, k3 0 @ (ls x j ls y nil j k1 ls k1 j) (ls k2 nil j k2 ls x j ls y j) (ls x nil ls y j j k3 ls k3 j) 1 A (3) in which each of the disjuncts corresponds directly to one of the pictures in figure ??. Figure 5. Termination of in-place list reversal Figure 4. A termination proof of list traversal stop x = nil; ls x nil 4 () ; x = nil; emp 2 () x = nil; x = nil; emp 2 ls x nil 1 goto 1 ls x nil 3 (WkM) x x, ls x nil 3 (WkA) x = nil; (x x, ls x nil) 3 (=) x = x ; x = nil; (x x , ls x nil) 3 x := [x] x = nil; (x x , ls x nil) 3 () x = nil; x x ls x nil 2 (Case ls) x = nil; ls x nil 2 if x = nil goto 4 ls x nil 1 Figure 4. A termination proof of list traversal stop x = nil (ls y nil ls x nil) 8 z = x x = nil (ls y nil (x = nil emp)) 4 ls y nil ls x nil 2 goto 2 ls y nil ls x nil 7 cut y = z (ls y nil z y ls x nil) 7 y := z ls y nil z y ls x nil 6 [z] := y ls y nil z x ls x nil 5 = I x = x z = x (ls y nil x x ls x nil) 5 x := [x] z = x (ls y nil x x ls x nil) 4 WkA z = x x = nil (ls y nil x x ls x nil) 4 (Case ls) z = x x = nil (ls y nil ls x nil) 4 z := x x = nil (ls y nil ls x nil) 3 if x = nil goto 8 ls y nil ls x nil 2 Figure 5. Termination of in-place list reversal This proof could have used multiplicative weakening in place of cut, but we have chosen not to do so in order to emphasise the connection with the next example. Example 6.3 (Termination of in-place "frying-pan list" reversal). The previous example algorithm will reverse a cyclic list segment, but the loop measure, and hence the proof of termination, is tricky. A cyclic list segment ls x j in which the terminating pointer j points to a node already in the segment can be seen as a separated three-part structure of two acyclic list segments and a "join node", represented in separation logic as: k.(ls x j j k ls k j) (2) Diagrammatically, such segments resemble a frying pan in which ls x j is the handle and ls k j is the pan. The reversal algorithm goes down the handle, reversing it until it reaches the join node, which it redirects towards the reversed handle; then it goes round the pan, reversing that; then it re-redirects the join node to the reversed pan; finally it comes back up the handle, re-reversing it. The precondition is (2) and the invariant is: k1, k2, k3 0 @ (ls x j ls y nil j k1 ls k1 j) (ls k2 nil j k2 ls x j ls y j) (ls x nil ls y j j k3 ls k3 j) 1 A (3) in which each of the disjuncts corresponds directly to one of the pictures in figure 6. This invariant is sufficiently obvious that it can be discovered automatically [13], but it's hard to see what formula to use as a measure for this process, because x plays different r^oles at different stages of the proof. A termination proof could be hacked up using auxiliary variables to record the join point and the stage of the proof, but the proof stages and the join point are proof artefacts, and it should be unnecessary to reveal them. Figure 7 shows a pre-proof of the judgement I 2, where I is the invariant (3) (with existential quantifications omitted for simplicity, and double-line steps indicating implicit additive weakening (WkA)). The cycles in the proof are: B reversing the handle, progressing by unrolling ls x j; D reversing the pan, progressing by unrolling ls x j; E re-reversing the handle, progressing (like figure 5) by unrolling ls x nil; A redirect the join-node to the reversed handle, move to reverse the pan; C redirect the join-node to the reversed pan, move to re-reverse the handle. It looks more complicated than it is. Figure 4. A termination proof of list traversal stop x = nil (ls y nil ls x nil) 8 () z = x x = nil (ls y nil (x = nil emp)) 4 ls y nil ls x nil 2 goto 2 ls y nil ls x nil 7 (Cut) y = z (ls y nil z y ls x nil) 7 y := z ls y nil z y ls x nil 6 [z] := y ls y nil z x ls x nil 5 (=) x = x z = x (ls y nil x x ls x nil) 5 x := [x] z = x (ls y nil x x ls x nil) 4 (WkA) z = x x = nil (ls y nil x x ls x nil) 4 (Case ls) z = x x = nil (ls y nil ls x nil) 4 z := x x = nil (ls y nil ls x nil) 3 if x = nil goto 8 ls y nil ls x nil 2 Figure 5. Termination of in-place list reversal This proof could have used multiplicative weakening in place of cut, but we have chosen not to do so in order to emphasise the connection with the next example. Example 6.3 (Termination of in-place "frying-pan list" reversal). The previous example algorithm will reverse a cyclic list segment, but the loop measure, and hence the proof of termination, is tricky. A cyclic list segment ls x j in which the terminating pointer j points to a node already in the segment can be seen as a separated three-part structure of two acyclic list segments and a "join node", represented in separation logic as: k.(ls x j j k ls k j) (2) Diagrammatically, such segments resemble a frying pan in which ls x j is the handle and ls k j is the pan. The reversal algorithm goes down the handle, reversing it until it reaches the join node, which it redirects towards the reversed handle; then it goes round the pan, reversing that; then it re-redirects the join node to the reversed pan; finally it comes back up the handle, re-reversing it. The precondition is (??) and the invariant is: k1, k2, k3 0 @ (ls x j ls y nil j k1 ls k1 j) (ls k2 nil j k2 ls x j ls y j) (ls x nil ls y j j k3 ls k3 j) 1 A (3) in which each of the disjuncts corresponds directly to one of the pictures in figure ??. This invariant is sufficiently obvious that it can be discovered automatically [?], but it's hard to see what formula to use as a measure for this process, because x plays different r^oles at different stages of the proof. A termination proof could be hacked up using auxiliary variables to record the join point and the stage of the proof, but the proof stages and the join point are proof artefacts, and it should be unnecessary to reveal them. Figure ?? shows a pre-proof of the judgement I 2, where I is the invariant (??) (with existential quantifications omitted for simplicity, and double-line steps indicating implicit additive weakening (WkA)). The cycles in the proof are: B reversing the handle, progressing by unrolling ls x j; Figure 5. Termination of in-place list reversal In TOY-C the program becomes 1: y := nil; 5: [z] := y; 2: if x = nil goto 8; 6: y := z; 3: z := x; 7: goto 2; 4: x := [x]; 8: stop (1) With precondition ls x nil the invariant of the loop (lines 2­8) is ls y nil ls x nil, and the interesting problem is then ls y nil ls x nil 2. The proof is shown in figure 5. (This and the next example are not completely formal: we have used `' and `' rather than `,' and `;' in bunches, and we have omitted steps which reorganise bunches.) This proof could have used multiplicative weakening in place of cut, but we have chosen not to do so in order to emphasise the connection with the next example. Example 6.3 (Termination of in-place "frying-pan list" reversal). The previous example algorithm will reverse a cyclic list segment, but the loop measure, and hence the proof of termination, is tricky. A cyclic list segment ls x j in which the terminating pointer j points to a node already in the segment can be seen as a separated three-part structure of two acyclic list segments and a "join node", represented in separation logic as: k.(ls x j j k ls k j) (2) Diagrammatically, such segments resemble a frying pan in which ls x j is the handle and ls k j is the pan. The reversal algorithm goes down the handle, reversing it until it reaches the join node, which it redirects towards the reversed handle; then it goes round the pan, reversing that; then it re-redirects the join node to the reversed pan; finally it comes back up the handle, re-reversing it. 107 y x j P rev H0 H1 (a) going in y x j rev P0 rev H P1 (b) going round x y j rev P rev H0 H1 (c) coming out Figure 6. Stages of reversing a frying-pan list with handle H and pan P The precondition is (2) and the invariant is: k1, k2, k3 (ls x j ls y nil j k1 ls k1 j) (ls k2 nil j k2 ls x j ls y j) (ls x nil ls y j j k3 ls k3 j) (3) in which each of the disjuncts corresponds directly to one of the pictures in figure 6. This invariant is sufficiently obvious that it can be discovered automatically (Distefano et al. 2006), but it's hard to see what formula to use as a measure of the number of loop executions, because x plays different r^oles at different stages of the proof. A termination proof might perhaps be hacked up using auxiliary variables to record the join point and the stage of the proof, but the proof stages and the join point are proof artefacts, and it should be unnecessary to reveal them. Figure 7 shows a pre-proof of the judgement I 2, where I is the invariant (3) (with existential quantifications omitted for simplicity, and double-line steps indicating implicit additive weakening (WkA)). The cycles in the proof are: B reversing the handle, progressing by unrolling ls x j; D reversing the pan, progressing by unrolling ls x j; E re-reversing the handle, progressing (like figure 5) by unrolling ls x nil; A redirect the join-node to the reversed handle, move to reverse the pan; C redirect the join-node to the reversed pan, move to re-reverse the handle. It looks more complicated than it is. * The right-hand stack, which deals with the third disjunct of the invariant and the re-reversal of the handle, is a straight-line list reversal, extremely similar to figure 5, with `stop' as the left antecedent of the if-goto step, and contradiction (because x = nilx = nil) the left alternative of the unrolling of ls x nil; * The left-hand stack, which deals with the first disjunct of the invariant and the original reversal of the handle, is also similar to figure 5, but in place of `stop' it has contradiction (x can't be nil whilst ls x j j k1) and in place of contradiction it has a sequence of executions (when you exhaust ls x j you reach the join-node j k1, and one execution of the loop deals with that); * The middle stack is very like the left-hand one. There are no infinite paths in this proof which don't involve unrolling one or more predicates infinitely often. Hence we have termination. Incidentally we observe that the number of loop executions is now clear: it's exactly twice the length of the handle (cycles B and E) plus the length of the pan (cycle D) plus 2 (paths A and C). The proof is long and tedious, but it involves no arithmetic and is very much the sort of thing an automatic tool (see e.g. Berdine et al. (2006a)) might be expected to do. One can imagine list processing in such a tool being so stylised that it could recognise the need to unroll the ls definitions to form the cycles B, D and E in the proof, and invent the empty definitions to form the paths A and C. 7. Conclusions and future work In this paper we outline a novel approach to the problem of proving termination of imperative programs. Our approach builds upon previous theoretical work in cyclic proof (Brotherston 2007; Brotherston and Simpson 2007; Brotherston 2005; Sprenger and Dam 2003) and relies heavily upon separation logic techniques (Berdine et al. 2005; Reynolds 2002; Bornat et al. 2004). The infinite descent flavour of our cyclic proofs is highly reminiscent of Lee, Jones and Ben-Amran's size-change termination principle (Lee et al. 2001). However, since the soundness requirement for our proofs is based on unfolding an inductive definition infinitely often along every infinite path, we avoid the need to explicitly construct and reason with ranking functions. In the case of our termination proof for reversal of a frying-pan list (Example 6.3) we did not have to introduce auxiliary variables to represent phases of the algorithm or deal with the conditional measure function which would be needed to exploit knowledge of the phase (and then the phase annotation would require a lexicographical ranking function). The Terminator and Mutant automated termination proving tools rely on a theoretical result concerning well-founded relations due to Podelski and Rybalchenko (Rybalchenko et al. 2006; Berdine et al. 2006b, 2007). The relationship between this principle and our cyclic proof principle is not yet clear to us. Nor does there seem to be a straightforward comparison with termination tools based on term rewriting (see e.g. Hofbauer and Serebrenik (2007)). However, we observe that our approach is amenable to interactive as well as automatic theorem proving, and we believe that the Smallfoot assertion checking tool for separation logic (Berdine et al. 2006a) is a promising candidate platform for implementing our approach. In general, any proof search mechanism for our formalism would need to extend the usual heuristics with the notion of searching for cycles (by identifying suitable companions elsewhere in the proof tree for the current buds, which correspond to unproven subgoals). The recent work on shape analysis for separation logic (see e.g. Guo et al. (2007); Lee et al. (2005); Distefano et al. (2006)) provides one obvious direction, based on abstract interpretation, with a finite domain built from separation logic formulas. Proof search combined with abstraction immediately gives a finite number of derivation trees, and cyclic pre-proofs for free. This immediately suggests an algorithm, and we can already see that it applies to the in-place list reversal program. Thus far we have dealt only with small iterative algorithms. We already understand how to deal with iterative problems that normally require lexicographic measures; we have not yet considered more complex ranking functions (but we note that Lee and BenAmram (Ben-Amram and Lee 2007) have shown that measures 108 () x=nil (lsxjlsynil jk1lsk1j) 8 lsyjlsk2nil jk2lsxj2 goto2 lsyjlsk2nil jk2lsxj7 (Cut) y=j (emplsk2nil jk2lsxj) 7 (=) y=zz=j (emplsynil zylsxj) 7 y:=z z=j (emplsynil zylsxj) 6 [z]:=y z=j (emplsynil zxlsxj) 5 (=) x=k1z=j x=j(emplsynil xk1lsk1j) 5 x:=[x] z=jx=j (emplsynil xk1lsk1j) 4 (=) z=xx=j (emplsynil jk1lsk1j) 4 lsynillsxj jk1lsk1j)2 goto2 lsynillsxj jk1lsk1j)7 (Cut) y=z(zy lsxjlsynil jk1lsk1j) 7 y:=z zy lsxjlsynil jk1lsk1j 6 [z]:=y zx lsxjlsynil jk1lsk1j 5 (=) z=xx=x (xxlsxjlsynil jk1lsk1j) 5 ==========================x:=[x] z=x (xxlsxjlsynil jk1lsk1j) 4 =====================================================(Casels) z=xx=nil (lsxjlsynil jk1lsk1j) 4 z:=x x=nil (lsxjlsynil jk1lsk1j) 3 ifx=nilgoto8 lsxjlsyniljk1lsk1j2 () x=nil (lsk2niljk2 lsxjlsyj) 8 lsyjlsxnil jk3lsk3j2 goto2 lsyjlsxnil jk3lsk3j7 (Cut) y=j (lsxniljk3 emplsk3j) 7 (=) y=zz=j (lsxnilzy emplsyj) 7 y:=z z=j (lsxnilzy emplsyj) 6 [z]:=y z=j (lsxnilzx emplsyj) 5 (=) x=k2z=j x=j(lsk2nil xk2emplsyj) 5 x:=[x] z=jx=j (lsk2nilxk2 emplsyj) 4 (=) z=xx=j (lsk2niljk2 emplsyj) 4 lsk2niljk2 lsxjlsyj)2 goto2 lsk2niljk2 lsxjlsyj)7 (Cut) y=z (lsk2niljk2 zylsxjlsyj) 7 y:=z lsk2niljk2 zylsxjlsyj6 [z]:=y lsk2niljk2 zxlsxjlsyj5 (=) x=xz=x (lsk2niljk2 xxlsxjlsyj) 5 =========================x:=[x] z=xx=j (lsk2niljk2 xxlsxjlsyj) 4 ====================================================(Casels) z=xx=nil (lsk2niljk2 lsxjlsyj) 4 z:=x x=nil (lsk2niljk2 lsxjlsyj) 3 ifx=nilgoto8 lsk2niljk2lsxjlsyj2 stop x=nil (lsxnillsyj jk3lsk3j) 8 () z=xx=nil x=nil(emp lsyjjk3 lsk3j) 4 lsxnillsyj jk3lsk3j2 goto2 lsxnillsyj jk3lsk3j7 (Cut) y=z(zy lsxnillsyj jk3lsk3j 7 y:=z zylsxnil lsyjjk3 lsk3j 6 [z]:=y zxlsxnil lsyjjk3 lsk3j 5 (=) x=xz=x (xxlsxnil lsyjjk3 lsk3j) 5 =====================x:=[x] z=xx=nil (xxlsxnil lsyjjk3 lsk3j) 4 ls z=xx=nil (lsxnillsyj jk3lsk3j) 4 z:=x x=nil (lsxnillsyj jk3lsk3j) 3 ifx=nilgoto8 lsxnillsyjjk3lsk3j2 () (lsxjlsyniljk1lsk1j)(lsk2niljk2lsxjlsyj)(lsxnillsyjjk3lsk3j)2 Figure 7. A termination proof of in-place reversal of a frying-pan list x A B C j) D E Figure 7. A termination proof of in-place reversal of a frying-pan list 109 need not be arbitrarily complicated in size-change transition problems). We intend to consider more difficult problems such as tree algorithms: to deal with recursive algorithms we will need at least to include postconditions in our termination judgements; to deal with iterative tree algorithms we will have to consider subtle termination measures. We do not claim to have invented a panacea. We have uncovered a novel approach to termination which gives natural-seeming proofs which has already, in the frying-pan list example, simplified a previously difficult problem and which appears to have the potential to be extended in other directions. References Peter Aczel. An introduction to inductive definitions. In Jon Barwise, editor, Handbook of Mathematical Logic, pages 739­782. N-H, 1977. Amir M. Ben-Amram and Chin Soon Lee. Ranking functions for size change termination II. Presented at (Hofbauer and Serebrenik 2007), 2007. J. Berdine, C. Calcagno, and P.W. O'Hearn. Symbolic execution with separation logic. In APLAS 2005, volume 3780 of LNCS, pages 52­68, 2005. J. Berdine, C. Calcagno, and P.W. O'Hearn. Smallfoot: Automatic modular assertion checking with separation logic. In FMCO, volume 4111 of LNCS, pages 115­137, 2006a. J. Berdine, B. Cook, D. Distefano, and P. O'Hearn. Automatic termination proofs for programs with shape-shifting heaps. In CAV, volume 4144 of LNCS, pages 386­400, 2006b. J. Berdine, A. Chawdhary, B. Cook, D. Distefano, and P. O'Hearn. Variance analyses from invariance analyses. In 34th POPL, 2007. Richard Bornat, Cristiano Calcagno, and Peter O'Hearn. Local reasoning, separation and aliasing. In SPACE Workshop, 2004. James Brotherston. Cyclic proofs for first-order logic with inductive definitions. In B. Beckert, editor, TABLEAUX 2005, volume 3702 of LNAI, pages 78­92. Springer-Verlag, 2005. James Brotherston. Formalised inductive reasoning in the logic of bunched implications. In SAS-14, volume 4634 of LNCS, pages 87­103. SpringerVerlag, August 2007. James Brotherston. Sequent Calculus Proof Systems for Inductive Definitions. PhD thesis, University of Edinburgh, November 2006. James Brotherston and Alex Simpson. Complete sequent calculi for induction and infinite descent. In LICS-22, pages 51­60. IEEE Computer Society, July 2007. C. Calcagno, D. Distefano, P.W. O'Hearn, and H. Yang. Beyond reachability: Shape abstraction in the presence of pointer arithmetic. In SAS, volume 4134 of LNCS, pages 182­203, 2006. D. Distefano, P. O'Hearn, and H. Yang. A local shape analysis based on separation logic. In TACAS, volume 3920 of LNCS, pages 287­302, 2006. Bolei Guo, Neil Vachharajani, and David I. August. Shape analysis with inductive recursion synthesis. In PLDI, June 2007. Dieter Hofbauer and Alexander Serebrenik. The 9th international workshop on termination. Paris, France, 2007. Chin Soon Lee, Neil D. Jones, and Amir M. Ben-Amram. The size-change principle for program termination. In 28th POPL, 2001. O. Lee, H. Yang, and K. Yi. Automatic verification of pointer programs using grammar-based shape analysis. In ESOP, volume 3444 of LNCS, pages 124­140, 2005. Per Martin-L¨of. Haupstatz for the intuitionistic theory of iterated inductive definitions. In J.E. Fenstad, editor, Proceedings of the Second Scandinavian Logic Symposium, pages 179­216. N-H, 1971. P. W. O'Hearn and D. J. Pym. The logic of bunched implications. Bulletin of Symbolic Logic, 5(2):215­244, June 99. J. C. Reynolds. Separation logic: A logic for shared mutable data structures. In LICS, pages 55­74, 2002. A. Rybalchenko, B. Cook, and A. Podelski. Termination proofs for systems code. In PLDI, pages 415­426, 2006. Ulrich Sch¨opp and Alex Simpson. Verifying temporal properties using explicit approximants: Completeness for context-free processes. In FoSSaCS 2002, volume 2303 of LNCS, pages 372­386. Springer-Verlag, 2002. Christoph Sprenger and Mads Dam. On the structure of inductive reasoning: circular and tree-shaped proofs in the -calculus. In FOSSACS 2003, volume 2620 of LNCS, pages 425­440, 2003. N. Torp-Smith, L. Birkedal, and J. Reynolds. Local reasoning about a copying garbage collector. In POPL, pages 220­231, 2004. H. Yang and P. O'Hearn. A semantic basis for local reasoning. In 5th FOSSACS, LNCS 2303, 2002. A. Outline proof of soundness (Theorem 5.5) In this section we give a sketch of the proof of our main soundness result: Theorem 5.5. The proof employs the following auxiliary definition: Definition A.1. Let be a bunch and let be the position of a distinguished leaf F of . We observe that can be inductively defined (up to coherent equivalence ) by the following grammar, where F ranges over formulas: ::= F (F = F ) | ; | , ::= F | ; | , Now let s be a stack and h be a heap. We define the relation s by induction on the structure of (as given above) as follows: s, h0 |= F h0, h0 s F , F h, h0 s , F s, h |= h, h0 s (; ), F h1, h0 s , F s, h2 |= h1 h2, h0 s (, ), F Intuitively, h, h s , F holds iff s and h satisfy , F is a leaf of and h is the sub-heap of h that satisfies F . In other words, the splitting of into F and a surrounding "bunch context" is mirrored by the splitting of h into h and a surrounding "heap context". Lemma A.2. Each of the Hoare logic rules for termination judgements given in Figure 3 enjoy the following two properties: 1. if the conclusion of the rule, say i, is invalid, i.e. there is some stack s and heap h such that s, h |= but (i, s, h) does not hold, then there is some premise i of the rule, a stack s and a heap h such that s , h |= F but (i , s , h ) does not hold; 2. if there is a trace (, ) following the edge ( i, i ) then, given a heap h0 satisfying h, h0 s , F , there exists a heap h0 such that h , h0 s , F . Furthermore, the following relation holds (and is well-defined): least s.t. s, h0 |=[PP] F least s.t. s , h0 |=[PP] F where |=[PP] is the satisfaction relation defined as in Figure 2, except that for all i {1, . . . , n} we have [[Pi]] = P i , i.e. each inductive predicate is interpreted using its th approximant (cf. Definition 3.4). Furthermore, if (, ) is a progressing trace, then this relation holds with > in place of . Proof. (Sketch) We just need to check that both properties of the lemma hold for each proof rule. The first property is just one way of stating that the rules are locally sound, i.e. that falsifiability of the conclusion of a rule implies the falsifiability of one of its premises 110 (cf. Proposition 4.5). For the second property, we need to show that if there is a trace following the edge from the conclusion to this falsifiable premise and h, h0 s , F holds, i.e. h0 is the sub-heap of h used to satisfy F in the falsifying interpretation of i, then we can construct a suitable substate h0 of h that can be used to satisfy F in the constructed falsifying interpretation of i . The main interesting case is when the rule applied is a case-split rule (Case P) and (, ) is a progressing trace, with the strict inequality relying on the fact that if the formula Pt unfolded by the rule is satisfied by s and h0, i.e. (h0, [[t]]s) is in some approximant P of P, then for every case-descendant Qu of Pt we must have (h0, [[u]]s) in some strictly smaller approximant Q< of Q. Having proved the above lemma, concerning edges in a proof tree, we can straightforwardly extend the two properties of the lemma to cover paths in a pre-proof graph: Lemma A.3. Let P be a pre-proof of 0 i0 and suppose that 0 i0 is invalid. Then there exists an infinite path (j ij )j0 in GP , a sequence (sj)j0 of stacks and a sequence (hj)j0 of heaps such that the following two properties hold: 1. for all j 0, the judgement j ij is false with respect to the stack sj and heap hj; 2. if there is a trace (j)jm following a tail (j ij )jm of the path (j ij )j0, then there exists a second sequence of heaps (hj)jm such that, for all j m: least s.t. sj, hj |= F least s.t. sj+1, hj+1 |= F Furthermore, if j is a progress point of the trace, then this relation holds with > in place of . Proof. 0 i0 , s0 and h0 are given by assumption. If we inductively assume that we have constructed k ik , sk and hk, then property 1 of Lemma A.2 tells us that we can construct k+1 ik+1 , sk+1 and hk+1. Now if we suppose that there is a trace following some tail (j ij )mjk+1 of the path constructed so far, property 2 of Lemma A.2 tells us that we can construct the required sequence (hj)mjk+1. (It is easy to see how to construct the first element hm of this sequence because we have hm, hm sm m, Fm .) Proof of Theorem 5.5. If we suppose that i has a proof P but is invalid, i.e. false in some stack s and heap h, then we can use property 1 of Lemma A.3 to construct an infinite path in GP together with a sequence of stacks and heaps that falsify each sequent along the path. Since P is a proof, there is an infinitely progressing trace following some tail of . Thus we can invoke property 2 of Lemma A.3 to create a monotonically decreasing chain of ordinals which, since the trace progresses infinitely often, must decrease infinitely often. This contradicts the well-foundedness of the ordinals, so i must indeed be valid. B. Outline proof of Theorem 5.7 (via termination weakest preconditions) In this section we give a sketch of the proof of Theorem 5.7. First, we present a construction that transforms a program into a family of mutually defined inductive predicates, which capture the weakest precondition for termination of the program. Then, we show that every valid termination judgement has a cyclic proof in our system, which uses the inductive predicates obtained from the program. Consider a program 1 : C1; ; n : Cn, and let x be the variables occurring in the program. For each program point i, we define a corresponding inductive predicate termi x in Figure 8 (we use the notation E- as an abbreviation for x. Ex). The result is a collection of predicates such that cycles in the definitions correspond directly to cycles in the control flow of the program. Example B.1 (List deletion program). The list deletion program: 1: if x = nil goto 6; 4: free(t); 2: t := x; 5: goto 1; 3: x := [x]; 6: stop; gives the following definitions. (x = nil term6 xt) term1 x t (x = nil term2 x t) term1 x t term3 x x term2 x t x x ((xx ) -- term4 x t) term3 x t (t -) term5 x t term4 x t term1 x t term5 x t term6 x t By applying simplifications we obtain a single inductive predicate for location 1. First notice that inlining term3, term4, term5 we obtain x x (x x -- (x - term1 x x)) term2 x t and the left-hand side can be simplified to x x term1 x x. By further inlining and simplification, and noticing that the parameter t is not used actively in the definition, we obtain x = nil term1 x x x term1 x term1 x which is exactly analogous to ls x nil except that we can have garbage in the heap, since garbage does not affect termination. Example B.2 (In-place list reversal program). Consider the inplace list reversal program presented in Example 6.2. The corresponding inductive definition for termination at program point 2 can be simplified to obtain the following definition: x = nil term2 x y x x (x y -- term2 x x) term2 x y In this inductive definition we cannot eliminate --. Instead we can give a characterisation using another predicate. If we define the cyclic list predicate cl by: ls x x x x ls x x cl x then the following equivalence holds: term2 x y (ls x nil) (cl x ls y nil) which means that the program terminates either by traversing the acyclic list starting from x and ending in nil, or by traversing the cyclic list and then the acyclic list starting from y and ending in nil. The following lemma shows that the termi predicate indeed guarantees termination from program point i. Lemma B.3. (i, s, h) implies s, h |= termi x. Proof. The proof is by induction on the length n of the longest computation (i.e., -sequence) starting at (i, s, h). We show some cases; other cases are analogous. Case Ci x := E. The computation proceeds with (i + 1, s , h), where s = s[x [[E]]s]. Then we have (i + 1, s , h) with longest computation of length n - 1. By induction hypothesis we have s , h |= termi+1 x, therefore s, h |= termi+1 (x[E/x]). By definition of termi we have s, h |= termi x, which concludes the case. 111 Command Inductive definition Ci x := E termi+1(x[E/x]) termi x Ci x := [E] E x (E x -- termi+1(x[x /x])) termi x Ci [E] := F (E -) ((E F) -- termi+1 x) termi x Ci x := new() x , y . (x y ) -- termi+1(x[x /x]) termi x Ci free(E) (E-) termi+1 x termi x Ci if Cond goto j Cond termj x termi x Cond termi+1 x termi x Ci stop termi x Figure 8. Transformation of commands to inductive predicates Case Ci x := new(). We need to show s, h |= termi x, that is s, h |= x , y . (x y ) -- termi+1(x[x /x]) for x , y fresh in x. Take v1, v2 Val, and h such that s[x v1, y v2], h |= (x y ) and h h is defined. Then h = [v1 v2] and v1 Loc \ dom(h). It remains to show s[x v1, y v2], h[v1 v2] |= termi+1 (x[x /x]). Now the computation can proceed with (i+1, s[x ], h[ v]) for any Loc\dom(h) and v Val, in particular with = v1 and v = v2. Then we have (i + 1, s[x v1], h[v1 v2]) with longest computation of length n - 1. By induction hypothesis we have s[x v1], h[v1 v2] |= termi+1 x, therefore s[x v1, y v2], h[v1 v2] |= termi+1 (x[x /x]), as required. The following lemma shows how to construct a cyclic proof that termi is a termination precondition for program point i, which will be fundamental for the completeness result. Together with soundness of cyclic proofs (Theorem 5.5) and Lemma B.3, this implies that termi denotes the weakest precondition for termination at program point i. Lemma B.4. There is a cyclic proof of termi x i. Proof. We give a direct construction of the proof. For each command Ci in the program, let j1 jk be the possible program points where the execution might continue after executing Ci. We show how to construct a derivation tree of the form (termm x m)m=j1jk termi x i which admits a progressing trace from termi x i to each termm x m. The whole proof is then obtained by stacking the appropriate derivation tree on top of each bud, unless that bud is already matched with a companion in the derivation tree already constructed. At the end of the process, each bud will be assigned a companion, and every infinite path in the resulting graph will obviously progress infinitely often. We show in detail the construction for some interesting cases: Case Ci x := E. We have the following inductive definition for termi x: termi+1(x[E/x]) termi x We can derive: termi+1 x i+1 (WkA) x = E[x /x] ; termi+1(x) i+1 (=) x = E[x /x] ; termi+1(x[E[x /x]/x]) i+1 x := E termi+1(x[E/x]) i (Case termi) termi x i Case Ci x := new(). We have the following inductive definition for termi x: x , y . (x y ) -- termi+1 (x[x /x]) termi x We can derive: termi+1 x i+1 (--) x x , ((xx ) -- termi+1 (x[x /x][x/x ])) i () x x , (x , y . (x y ) -- termi+1 (x[x /x])) i+1 x := new() x , y . (x y ) -- termi+1 (x[x /x]) i (Case termi) termi x i Case Ci if Cond goto j. We have the following inductive definition for termi x: Cond termj x termi x Cond termi+1 x termi x We can derive: termj x j (WkA) Cond; termj x j if Cond; termj x i () Cond termj x i termi+1 x i+1 (WkA) Cond; termi+1 x i+1 if Cond; termi+1 x i () Cond termi+1 x i (Case termi) termi x i Proof of Theorem 5.7. Assume that i is valid. For any s Stacks and h Heaps such that s, h |= , we have (i, s, h), hence s, h |= termi x by Lemma B.3. Therefore termi x is valid, and it is derivable by the completeness assumption of the underlying proof system. The required proof of i can be constructed using an instance of Cut as follows termi x i termi x (Cut) i where the dots are a placeholder for the proof of termi x i from Lemma B.4. 112