Finalization and Related Issues

Luke Tierney

1999/10/25

Introduction

If a lisp object is used to represent an external object [In the Lisp tradition I'm using ``object'' here to mean generic entity, rather than an object-oriented programming object.] , then it may be useful to schedule some cleanup operations to occur when the last reference to the object is garbage collected. This is often called finalization. Here is a first attempt at a finalization system.

Two additional, somewhat related, issues are

Arranging for an external object to be ``suspended'' when it is stored in a saved workspace and the system is about to exit, and having the object restored when the saved workspace is restored.
Arranging for some cleanup operations to occur when the system terminates.

Initial proposals for dealing with these are also presented here.

Finalization

The finalization mechanism uses a single function, register-finalizer, to arrange for a function (the second argument) to be called when the first argument is garbage collected. More precisely, if after a garbage collection the only reference to the first argument is through its registration in the finalization system, then at the end of the garbage collection process the object's finalization function is called with the object as its argument . The object itself will not actually be reclaimed until the following garbage collection, assuming no new references to it where created by the finalization function. At present there is no way to remove a registration, and there is no check to make sure an object is registered only once.

As an example,

> (register-finalizer (copy-seq "hello") 'print)
NIL
> (gc)

"hello" 
NIL

The copy-seq is needed since the expression itself is saved as the value of the + variable.

Errors in finalization functions currently print a message to the listener but don't interrupt processing of pending finalizations. This needs to be thought through a bit more. ****

Saved Workspaces

If an object requires some action to be performed before it is stored in a saved workspace, register the action with an expression of the form

(register-saver object fun)

with fun a function of one argument, the object. A saver can be removed with

(unregister-saver object)

At the other end, if an object requires some processing to restore it, register a restoration function with

(register-restorer object fun)

and remove the registration with

(unregister-restorer object)

The savers are called by save-workspace before committing the save. The restorers are called in the initialization process after the standard initialization function. This may not be quite the best place---it needs to be thought through a bit more. ****

Here is a quick example:

> (load "svrstr")
; loading svrstr.fsl
T
> (setf x "fred")
"fred"
> (register-saver x #'(lambda (x) (format t "saving ~s~%" x)))
#<Closure: #4006d0e0>
> (register-restorer x #'(lambda (x) (format t "restoring ~s~%" x)))
#<Closure: #4006ceb0>
> (save-workspace "test.wks")
saving "fred"
nokomis% xlisp -wtest.wks 
XLISP-PLUS version 3.0
Portions Copyright (c) 1988, by David Betz.
Modified by Thomas Almy and others.
XLISP-STAT Release 3.50 (Beta).
Copyright (c) 1989-1994, by Luke Tierney.

restoring "fred"
>

Exit Handlers

At the moment, there is only an internal C mechanism for installing exit handlers. In particular, xlatexit is used just like C's atexit.

Eventually it would probably be useful to have Lisp exit handlers, but I'm not quite sure exactly where they should go at this point since there are several ways to exit (calling exit and signaling an EOF to the listener on UNIX, for example) and it isn't clear how to handle both of these. ****

Outline of an Example

As an example, suppose you are building a socket interface. Your system might then do the following:

Maintain an internal list of all active sockets.
Use finalization to make sure sockets are closed on GC.
Use saver functions to make sure sockets are closed on saving to a workspace. If it is meaningful, the restorer might attempt to re-establish the connection of a socket that was open at close time (probably not a good idea, but it could be done).
An exit handler can use the internal list to make sure all sockets are closed on exit. MacOS in particular seems to be rather fussy about this.

Installing the Code

This code will probably be included in 3.51 (perhaps with some modifications). If you want to try it with 3.50 you need to do the following:

Apply the finalization patch with
```
patch < finalize.patch
```
at the root of the source tree and make.
Apply the atexit patch with
```
patch < atexit.patch
```
at the root of the source tree and make.
Compile and load the file svrstr.lsp and create a new saved workspace.

Finalization Implementation

The finalization system requires a few changes in xldmem.c and adding an entry to the function table. At the moment these changes will only work for the generational collector. I don't use the mark-and-sweep collector, but it would be fairly simple to make things work there to. This implementation produces a patch file to be applied to the 3.50 sources.

At the moment, finalization entries are stored in finalize_registered as a simple list of (object . fun) pairs. The external function for installing a pair is

<xregfinal>= (U->)
> LVAL xregfinal(V)
> {
>   LVAL arg, fun;
>   arg = xlgetarg();
>   fun = xlgetarg();
>   xllastarg();
>   finalize_registered = cons(cons(arg, fun), finalize_registered);
>   return NIL;
> }

Defines xregfinal (links are to index).

The declarations in xlbfun.h need to be changed,

<add xregfinal declaration>= (U->)
<     xgc(V),xexpand(V),xalloc(V),xmem(V);
---
>     xgc(V),xexpand(V),xalloc(V),xmem(V),xregfinal(V);

Defines xregfinal (links are to index).

and the xlftab.c entry for register-finalizer needs to be added,

<add register-finalizer function table entry>= (U->)
> {   "REGISTER-FINALIZER",S, xregfinal    },

Defines register-finalizer (links are to index).

In the garbage collection, after all other roots have been completely processed we check the registered objects to see if any have no references other that the one from their registration -- these are the objects for which is_new_node returns TRUE. The corresponding list entries are removed from the registrations list and put on the finalize_pending list. These operations are destructive so no allocation occurs.

<check_finalize>= (U->)
> static VOID check_finalize(V)
> {
>   LVAL last = NIL, next = finalize_registered, head, tail;
> 
>   while (consp(next)) {
>     if (is_new_node(car(car(next)))) {
>       head = next;
>       tail = cdr(next);
>       if (null(last))
>         finalize_registered = tail;
>       else
>         Rplacd(last, tail);
>       next = tail;
>       Rplacd(head, finalize_pending);
>       finalize_pending = head;
>     }
>     else {
>       last = next;
>       next = cdr(next);
>     }
>   }
> }

Defines check_finalize (links are to index).

After check_finalize is called, if any objects are registered or pending their lists need to be processed and any resulting additional forwarding has to take place. **** The tests for NIL could be skipped.

<check for finalization and process pending and registered lists>= (U->)
>   check_finalize();
>   if (finalize_registered != NIL || finalize_pending != NIL) {
>     forward_node(finalize_registered);
>     forward_node(finalize_pending);
>     while ((tmp = forwarded_nodes) != NIL) {
>       forwarded_nodes = NEXT_NODE_PTR(tmp);
>       forward_children(tmp);
>       TENURE_NODE(tmp);
>       unset_to_new_node(tmp);
>     }
>   }

At the end of the garbage collection the pending finalizations are processed.

<process pending finalizations>= (U->)
>   if (! null(finalize_pending))
>     do_finalize();

The do_finalize function puts up a context for catching errors and non-local exits and then successively removes each pending entry and processes it. This ought to be modified to eat the error message, as in ignore-errors.

<do_finalize>= (U->)
> static VOID do_finalize(V)
> {
>   CONTEXT cntxt;
>   LVAL next;
> 
>   xlsave1(next);
>   xlbegin(&cntxt,CF_UNWIND|CF_ERROR,NIL);
>   setjmp(cntxt.c_jmpbuf);
>   while (consp(finalize_pending)) {
>     next = finalize_pending;
>     finalize_pending = cdr(next);
>     xlapp1(cdr(car(next)), car(car(next)));
>   }
>   xlend(&cntxt);
>   xlpop();
> }

Defines do_finalize (links are to index).

The declarations added to xldmem.c are

<new declarations for xldmem.c>= (U->)
> static LVAL finalize_registered, finalize_pending;
> static VOID check_finalize(V);
> static VOID do_finalize(V);

Defines check_finalize, do_finalize, finalize_pending, finalize_registered (links are to index).

and the registration and pending lists are initialized by

<initialize registration and pending lists>= (U->)
>     finalize_registered = NIL;
>     finalize_pending = NIL;

The rest of the patch file is

<finalize.patch>=
Index: xldmem.c
===================================================================
RCS file: /NOKOMIS/users/luke/SRC/xlispstat/xldmem.c,v
retrieving revision 1.29
diff -r1.29 xldmem.c
9a10,13
<new declarations for xldmem.c>
> 
311a316,318
> 
<initialize registration and pending lists>
1107a1115,1125
<check for finalization and process pending and registered lists>
1133a1152,1153
<process pending finalizations>
1630a1651,1653
> 
<initialize registration and pending lists>
2864a2888,2937
> 
<check_finalize>
> 
<do_finalize>
> 
<xregfinal>
Index: xlftab.c
===================================================================
RCS file: /NOKOMIS/users/luke/SRC/xlispstat/xlftab.c,v
retrieving revision 1.20
diff -r1.20 xlftab.c
309a310
<add register-finalizer function table entry>
826d826
< {   NULL,               S, xnotimp      },
Index: xlftab.h
===================================================================
RCS file: /NOKOMIS/users/luke/SRC/xlispstat/xlftab.h,v
retrieving revision 1.21
diff -r1.21 xlftab.h
124c124
<add xregfinal declaration>

**** At the moment, the finalization happens under the interrupt suspension and with the gc cursor active -- should that be changed?

Exit Handler Implementation

At the moment all this does is define the xlatexit macro to call atexit. The following patch puts this define into xlisp.h. Since the implementation currently does not use any exit handlers all the minimum 32 handlers mandated by ANSI/ISO C should still be available.

<atexit.patch>=
Index: xlisp.h
===================================================================
RCS file: /NOKOMIS/users/luke/SRC/xlispstat/xlisp.h,v
retrieving revision 1.58
diff -r1.58 xlisp.h
1849a1850,1851
> #define xlatexit(f) atexit(f)
>

Defines xlatexit (links are to index).

Eventually I will probably expand this to make sure that space for additional exit handlers is allocated if needed even if the base system atexit does not do this. In addition, on the Mac it may be useful to be more aggressive about making sure exit handlers are called even in abnormal exit situations (assuming the OS survives). Tcl 8.0 seems to go to great lengths on this -- I'll need to look into that more carefully. ****

Saver/Restorer Implementation

The saver and restorer functions are stored in hash tables.

<set up saver and restorer hash tables>= (U->)
(defvar *saver-table* (make-hash-table))
(defvar *restorer-table* (make-hash-table))

Defines *restorer-table*, *saver-table* (links are to index).

Savers are installed and removed by

<define register-saver and unregister-saver>= (U->)
(defun register-saver (arg fun)
  (setf (gethash arg *saver-table*) fun))

(defun unregister-saver (arg)
  (remhash arg *saver-table*))

Defines register-saver, unregister-saver (links are to index).

<define register-restorer and unregister-restorer>= (U->)
(defun register-restorer (arg fun)
  (setf (gethash arg *restorer-table*) fun))

(defun unregister-restorer (arg)
  (remhash arg *restorer-table*))

Defines register-restorer, unregister-restorer (links are to index).

The new definition of save-workspace inserts a maphash over the savers after removing the hardware objects. Eventually the hardware object removal can be pushed into the savers.

<redefine save-workspace>= (U->)
(defun save-workspace (name)
  (dolist (h (copy-list *hardware-objects*))
    (send (third h) :remove))
  (maphash #'(lambda (arg fun) (funcall fun arg)) *saver-table*)
  (save name)
  (exit))

Defines save-workspace (links are to index).

The function to run the restorers is

<define run-restorers>= (U->)
(defun run-restorers ()
  (maphash #'(lambda (arg fun) (funcall fun arg)) *restorer-table*))

Defines run-restorers (links are to index).

and it is placed in the *startup-functions* queue by

<install run-restorers in initialization functions>= (U->)
(unless (member 'run-restorers *startup-functions*)
        (setf *startup-functions*
              (append *startup-functions* '(run-restorers))))

**** This may not be the best place for it --- think it through.

The file to install all this is svrstr.lsp. Eventually this should be put in standard system files. For now, just load this file and make a new saved workspace.

<svrstr.lsp>=
(in-package "XLISP")

(export '(register-saver unregister-saver
          register-restorer unregister-restorer))

<set up saver and restorer hash tables>
<define register-saver and unregister-saver>
<define register-restorer and unregister-restorer>
<redefine save-workspace>
<define run-restorers>
<install run-restorers in initialization functions>

**** One problem with this lisp-only implementation is that finalization and save/restore handling don't work together quite right: if you register a save/restore handler the object will never be GC'd. One way around this would be tom make the save/restore handlers an internal root that is only processed after finalizations are checked for. Another possibility would be to add some form of weak pointers or weak hash tables. I'm not sure about the right approach.

Code Chunks

<add register-finalizer function table entry>: D1, U2
<add xregfinal declaration>: D1, U2
<atexit.patch>: D1
<check for finalization and process pending and registered lists>: D1, U2
<check_finalize>: D1, U2
<define register-restorer and unregister-restorer>: D1, U2
<define register-saver and unregister-saver>: D1, U2
<define run-restorers>: D1, U2
<do_finalize>: D1, U2
<finalize.patch>: D1
<initialize registration and pending lists>: D1, U2
<install run-restorers in initialization functions>: D1, U2
<new declarations for xldmem.c>: D1, U2
<process pending finalizations>: D1, U2
<redefine save-workspace>: D1, U2
<set up saver and restorer hash tables>: D1, U2
<svrstr.lsp>: D1
<xregfinal>: D1, U2