Dynamic Datastructures in ACL2: A Challenge David Greve and Matt Wilding Nov. 2002 Background ---------- Arrays, structures, and stacks are common data structures that can be found in some form in nearly every useful software system. The utility that such structures provide in implementing complex systems, however, hinges on the often overlooked assumption of independence. What the programmer often takes for granted - that modifying structure X will not change the value of structure Y - becomes a significant burden when the formal correctness of the final implementation is considered in the context of a linear address space. Several verification projects with which we are familiar have faced this challenge, including Moore's Piton compiler, Bevier's KIT proof, and Yu's 68020 code proofs. In each project the disjointness of data structures was described and verified as part of the larger verification effort. However, the proof architecture in each of these projects involves blocks of data that resided within distinct ranges rather than arbitrary, dynamic structures. This approach is not feasible for the kinds of complex, pointer-rich data structures we encounter in proofs about microprocessor microarchitectures. A first challenge ----------------- We start with two functions, a-mark-objects and b-mark-objects, that operate on "A" and "B" type data structures, respectively. The functions a-collect and b-collect crawl over their respective data structures collecting lists of addresses used by the structures. The function compose-bab is a composition of two a-mark-objects and a b-mark-object. (Note the definitions in the code below.) Prove the following theorems: ;; The read of an arbitrary address location outside of the ;; data structure modified by a-mark-objects is not affected ;; by a-mark-objects. (defthm rd-read-over-a-mark-objects (implies (let ((list (a-collect ptr n ram))) (and (not (member addr list)) (unique list))) (equal (g addr (a-mark-objects ptr n ram)) (g addr ram)))) ;; The read of an arbitrary address location outside of the ;; union of the three data structures modified by compose-bab ;; is not effected by compose-bab. (defthm read-over-bab (implies (let ((list (append (b-collect ptr1 n1 ram) (a-collect ptr2 n2 ram) (b-collect ptr3 n3 ram) ))) (and (not (member addr list)) (unique list))) (equal (g addr (compose-bab ptr1 n1 ptr2 n2 ptr3 n3 ram)) (g addr ram)))) ;; a-mark-objects and b-mark-objects commute when their respective ;; data structures are disjoint. (defthm a-mark-over-b-mark (implies (let ((list (append (a-collect ptr1 n1 ram) (b-collect ptr2 n2 ram)))) (unique list)) (equal (a-mark-objects ptr1 n1 (b-mark-objects ptr2 n2 ram)) (b-mark-objects ptr2 n2 (a-mark-objects ptr1 n1 ram)))) :hints (("goal" :in-theory (enable a-mark-objects-open)))) More challenges --------------- 1) Add fields to the "A" and "B" data structures and measure the performance of your proof. You may have constructed a proof that scales quadratically with the number of entries in the data structure. Is there a more efficient solution? 2) The proof of read-over-bab suggests that many theorems will need to be proved about a system with many unique functions and data structures. Generalize the notion of "collect" so that only a single version of it need be introduced. (And, therefore, only a single "read-over-write theorem need be proved for each data structure operation.) One approach for accomplishing this would be to provide a structure "map" as an argument to "collect". 3) In a system with many functions and data structures, it would be tedious to prove the various "a-mark-over-b-mark"-type theorems for all possible combinations of functions and data structures. Is there some technique that would allow us to state and prove, in general, that for all functions (f1,f2) that operate only on well defined data structures (d1,d2) we get: (defthm xxx (implies (unique (append d1 d2)) (equal (f1 (f2 ram)) (f2 (f1 ram))))) Can the application of this theorem be automated? 4) The proposed representation of data structures is somewhat limited. Circular data structures, data structures with several pointers to one node, are precluded by the assumption of uniqueness. How might such structures be incorporated into the reasoning framework developed so far? 5) What other issues might be encountered in the application of these techniques to real-world problems? Code ---- ; We assume that two books have been loaded, part of the ACL2 ; distribution: ; (include-book "XXX/books/misc/records") ; (include-book "XXX/books/arithmetic/top-with-meta") (defun seq-int (start len) (if (zp len) nil (cons start (seq-int (1+ start) (1- len))))) (defun unique (list) (if (consp list) (and (not (member (car list) (cdr list))) (unique (cdr list))) t)) ;; We introduce two data structures. ;; "A" nodes have 4 words. Words 0 and 3 are scalars and words 1 and ;; 2 are A node pointers. ;; Collect addresses in the first n nodes of an a structure (defun a-collect (ptr n ram) (declare (xargs :measure (nfix n))) ; +---+ ; 0: | | ; +---+ ; 1: | o-+--> a-node ; +---+ ; 2: | o-+--> a-node ; +---+ ; 3: | | ; +---+ (if (zp n) nil (if (zp ptr) nil (append (seq-int ptr 4) (a-collect (g (+ 1 ptr) ram) (1- n) ram) (a-collect (g (+ 2 ptr) ram) (1- n) ram) )))) ;; "B" nodes have 3 words. Word 2 is an integer, words 0 and 1 are ;; B-node pointers. "0" is a null pointer ;; Collect addresses from the first n nodes of a b structure (defun b-collect (ptr n ram) (declare (xargs :measure (nfix n))) ; +---+ ; 0: | o-+--> b-node ; +---+ ; 1: | o-+--> b-node ; +---+ ; 2: | | ; +---+ (if (zp n) nil (if (zp ptr) nil (append (seq-int ptr 3) (b-collect (g ptr ram) (1- n) ram) (b-collect (g (+ 1 ptr) ram) (1- n) ram) )))) ;; Crawl through at most n nodes in an "a" structure along the second ;; pointer. Modify word 0 of each node by adding word 2 to it. (defun a-mark-objects (addr n ram) (declare (xargs :measure (nfix n))) (if (zp n) ram (if (zp addr) ram (let ((ram (s addr (+ (g addr ram) (g (+ 2 addr) ram)) ram))) (a-mark-objects (g (+ addr 2) ram) (1- n) ram))))) ;; Crawl through at most n nodes in an "b" structure along the pointer ;; in the first field (defun b-mark-objects (addr n ram) (declare (xargs :measure (nfix n))) (if (zp n) ram (if (zp addr) ram (let ((ram (s (+ 2 addr) 0 ram))) (b-mark-objects (g addr ram) (1- n) ram))))) (defun compose-bab (ptr1 n1 ptr2 n2 ptr3 n3 ram) (let ((ram (b-mark-objects ptr1 n1 ram))) (let ((ram (a-mark-objects ptr2 n2 ram))) (let ((ram (b-mark-objects ptr3 n3 ram))) ram))))