Dumping data structures with circular references.



Received: from THOMAS.LAAC-AI.Dialnet.Symbolics.COM by ALAN.LAAC-AI.Dialnet.Symbolics.COM via CHAOS with CHAOS-MAIL id 5020; Sat 4-Nov-89 15:06:59 PST
Date: Sat, 4 Nov 89 15:06 PST
From: Robert D. Pfeiffer <RDP@ALAN.LAAC-AI.Dialnet.Symbolics.COM>
Subject: Dumping data structures with circular references.
To: SLUG@ALAN.LAAC-AI.Dialnet.Symbolics.COM
In-Reply-To: <19891103215714.7.RSL@MAX-FLEISCHER.ILA-SF.Dialnet.Symbolics.COM>,
             <8911040016.AA15381@uunet.uu.net>
Message-ID: <19891104230648.7.RDP@THOMAS.LAAC-AI.Dialnet.Symbolics.COM>

    Date: Fri, 3 Nov 89 13:57 PST
    From: rsl@MAX-FLEISCHER.ILA-SF.Dialnet.Symbolics.COM (Richard Lamson)

    [I am not addressing this one directly to RDP, since both of your addresses
    bounced back to me yesterday.  What's the right way to send you mail?]

[Yes, I'm not surprised.  We've got too many diverse machines involved in
handling (and manipulating the addresses of) our mail.  I thought this
issue was more or less straightened out at this point -- but who knows?
Anyway, to the best of my knowledge, I should be reachable with the
following address:

RDP@ALAN.KAHUNA.DECNET.LOCKHEED.COM

If that doesn't work, I guess just replying publicly on the SLUG list
is the best way.  (And besides, hopefully these conversations are
interesting/useful to other SLUG members, anyway.)]

	Date: Fri, 3 Nov 89 09:19 PST
	From: Robert D. Pfeiffer <RDP@ALAN.LAAC-AI.Dialnet.Symbolics.COM>

	Many thanks for the replies so far (keep those cards and letters
	coming!).  I'm pursuing one additional item on this topic and it
	involves dumping the data in two passes.  The problem is that
	SYS:DUMP-FORMS-TO-FILE takes a filename rather than a stream and it
	doesn't allow me to say "append".  

    Actually, even if you could "append", it writes SI:BIN-OP-EOF into the file
    so loading the file would not read past the first EOF anyway.

OK, there's one pitfall to avoid.

	Anyone have an idea how best to solve this?  Maybe write two separate
	files with SYS:DUMP-FORMS-TO-FILE and then append them after the fact?
	Is it easy to append two binary files (or are there subtleties of the
	file format that will screw me)?  Or if I had the equivalent of
	SYS:DUMP-FORMS-TO-STREAM I guess I would be home free.  Then, I could
	simply do my own WITH-OPEN-FILE once and call SYS:DUMP-FORMS-TO-STREAM
	twice on it.

    Here is a synopsis of what I did the last time I needed to do this:

      (defmethod (dump database) (filename)
	(si:writing-bin-file (bin-stream filename)
	  (si:dump-attribute-list `(:package ,(package-name *package*) :mode :Lisp) bin-stream) 
	  (do-all-objects (object)
	    (si:dump-form-to-eval (creation-form object) bin-stream))
	  (do-all-objects (object)
	    (si:dump-form-to-eval (initialization-form object) bin-stream))))

    DO-ALL-OBJECTS does what you might expect, namely map over the entire
    database, binding the given variable to each object in turn.

    The CREATION-FORM method returns something like

      `(setf (gethash ',unique-id *database*) (make-instance ',flavor ...))

    The INITIALIZATION-FORM method returns something like:

      `(let ((object (gethash ',unique-id *database*))
             (prev (gethash ',(object-unique-id previous-object) *database*))
             (next (gethash ',(object-unique-id next-object) *database*))
	     (other ',(object-unique-id other-object))
	     ...)		; Any other object references
	 (setf (object-previous-object object) prev
	       (object-next-object object) next
	       (object-other-object object) other
	       ... 
	       (object-option-thing1 object) ',thing1
	       ...))

    although in the interests of efficiency you could package this up into a
    function.  It's also a good idea not to put in the QUOTE if you can avoid it,
    since this can make your BIN file much larger.  

Really, "much larger"?  I'm surprised that it's that significant.

							I.e., if your unique IDs are
    self-evaluating, or some of the other THING references are self-evaluating,
    just put in ,THING instead of ',THING since the former takes fewer bytes in
    the binary representation.

OK, I see.  Thanks for the tip!
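
The two-pass scheme and the quoting tip can be tried out in portable
Common Lisp.  What follows is only a sketch under stated assumptions:
EVAL stands in for writing and loading a BIN file (no SI: functions are
used), and the NODE structure, the *DATABASE* table, MAYBE-QUOTE,
DUMP-FORMS, and TWO-PASS-DEMO are all hypothetical names invented for
the illustration.

```lisp
;;; Portable sketch: EVAL stands in for dumping to and loading a BIN file.
(defvar *database* (make-hash-table :test #'eql)
  "Hypothetical stand-in for the real database: unique ID -> object.")

(defstruct node id name previous next)

(defun maybe-quote (thing)
  "Return THING unchanged if it is self-evaluating, else wrap it in QUOTE."
  (if (or (consp thing)
          (and (symbolp thing)
               (not (keywordp thing))
               (not (member thing '(nil t)))))
      (list 'quote thing)
      thing))

(defun creation-form (node)
  "Pass 1: create a bare object and register it under its unique ID."
  `(setf (gethash ,(maybe-quote (node-id node)) *database*)
         (make-node :id ,(maybe-quote (node-id node))
                    :name ,(maybe-quote (node-name node)))))

(defun initialization-form (node)
  "Pass 2: link objects by ID; every ID is registered by the time this runs."
  (flet ((lookup (other)
           (and other
                `(gethash ,(maybe-quote (node-id other)) *database*))))
    `(let ((object (gethash ,(maybe-quote (node-id node)) *database*)))
       (setf (node-previous object) ,(lookup (node-previous node))
             (node-next object) ,(lookup (node-next node)))
       object)))

(defun dump-forms (nodes)
  "Return all creation forms followed by all initialization forms."
  (append (mapcar #'creation-form nodes)
          (mapcar #'initialization-form nodes)))

(defun two-pass-demo ()
  "Build a two-node cycle, generate its forms, and rebuild it from them."
  (let* ((a (make-node :id 1 :name "a"))
         (b (make-node :id 2 :name "b" :previous a :next a)))
    (setf (node-previous a) b
          (node-next a) b)
    (let ((forms (dump-forms (list a b))))
      (clrhash *database*)
      (mapc #'eval forms)
      (let ((new-a (gethash 1 *database*))
            (new-b (gethash 2 *database*)))
        ;; The cycle is rebuilt between fresh objects, not the originals.
        (and (eq (node-next new-a) new-b)
             (eq (node-next new-b) new-a)
             (not (eq new-a a)))))))
```

Because no initialization form runs until every creation form has run,
the lookups in pass 2 always find their targets, so the cycle between
the two nodes causes no trouble.  MAYBE-QUOTE leaves numbers, strings,
keywords, NIL, and T bare and quotes only other symbols and conses.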

--------------------------------------------------------------------------------
    Date: Fri, 03 Nov 89 16:45:58 -0500
    From: kanderso@DINO.BBN.COM

      From: "RDP%ALAN.LAAC-AI.Dialnet.Symbolics.COM %ALAN.kahuna.DECNET.LOCKHEED.COM"@warbucks.ai.sri.com
      Date: Fri, 3 Nov 89 11:22:17 CST
  
      Many thanks for the replies so far (keep those cards and letters
      coming!).  I'm pursuing one additional item on this topic and it
      involves dumping the data in two passes.  The problem is that
      SYS:DUMP-FORMS-TO-FILE takes a filename rather than a stream and it
      doesn't allow me to say "append".  
  
      Anyone have an idea how best to solve this?  Maybe write two separate
      files with SYS:DUMP-FORMS-TO-FILE and then append them after the fact?
      Is it easy to append two binary files (or are there subtleties of the
      file format that will screw me)?  Or if I had the equivalent of
      SYS:DUMP-FORMS-TO-STREAM I guess I would be home free.  Then, I could
      simply do my own WITH-OPEN-FILE once and call SYS:DUMP-FORMS-TO-STREAM
      twice on it.
  
      Thanks again for any insight on this topic.
  
    This is part of dumping code we use for PCL objects, which are flavor
    instances.  It can be easily modified to work with normal flavors.
    The :FASD-FORM simply returns a form that makes an instance, and
    appends additional dump instructions onto *after-fasd-forms*.  Since
    the instances exist when the *after-fasd-forms* are dumped, there is
    no circularity problem.

    It would be better if, rather than a full make-instance (making an
    object and initializing it), one just allocated space for it.

    (defvar *after-fasd-forms* nil
      "Forms to dump after instances are made.")

Yes, building up such a list to do "object reference fixup" after the
:FASD-FORMs had run is just what I had in mind. 

    ;;; This is a version of SI:DUMP-FORMS-TO-FILE that lets you send messages to
    ;;; instances after you create them, so you can avoid circular references.
    (defun DUMP-FORMS-TO-FILE (filename forms &optional file-attribute-list)
      ;; If no package is specified for the file, use the USER package.
      (let ((*after-fasd-forms* nil))
	(unless (zl:get (scl:locf file-attribute-list) ':package)
	  (zl:putprop (scl:locf file-attribute-list) ':user ':package))
	(si:writing-bin-file (stream filename)
	  (si:dump-attribute-list file-attribute-list stream)
	  (dolist (form forms)
	    (si:dump-form-to-eval form stream)
	    (loop
	      (if *after-fasd-forms*
		  (si:dump-form-to-eval (pop *after-fasd-forms*) stream)
		  (return nil)))))))

    (flavor:defmethod (:FASD-FORM PCL:IWMC-CLASS) ()
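      ;; We use the fact that instances with metaclass CLASS are considered
      ;; flavor instances by the LISPM, so we just provide a :FASD-FORM that
      ;; works with DUMP-FORMS-TO-FILE.
      ;; Append to the queue rather than overwriting it, so that fixups queued
      ;; by other instances in the same form are not lost.
      (setq *after-fasd-forms*
            (nconc *after-fasd-forms*
                   `((load-slots ',scl:self ,@(dump-items scl:self))
                     ,@(after-dump-forms scl:self)
                     (after-load ',scl:self))))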
      `(pcl:make-instance ',(class-name (class-of scl:self))))

    (defgeneric DUMP-ITEMS (object)
      (:method-combination append)
      (:documentation "Returns a list of slot-name slot-value pairs.
    These are used to restore a new instance of the object.  Slot values may
    refer to the object."))
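
The queue-and-drain idea can also be sketched in portable Common Lisp
with CLOS, without any of the SI: or PCL internals.  This is a minimal
illustration under stated assumptions, not the code above: EVAL stands
in for loading the BIN file, and the PERSON class, the *AFTER-FORMS*
and *LOADED* variables, and the functions INSTANCE-FORM, COLLECT-FORMS,
and QUEUE-DEMO are all hypothetical names.

```lisp
;;; Portable sketch of "create now, fix up references later".
(defvar *after-forms* '()
  "Fixup forms queued while creation forms are being generated.")

(defvar *loaded* (make-hash-table :test #'equal)
  "Hypothetical load-time registry: name -> reconstructed object.")

(defclass person ()
  ((name :initarg :name :accessor person-name)
   (friend :initform nil :accessor person-friend)))

(defun instance-form (person)
  "Return a form that creates PERSON; queue a form that sets its FRIEND slot."
  (let ((name (person-name person))
        (friend (person-friend person)))
    (setf *after-forms*
          (nconc *after-forms*
                 (list `(setf (person-friend (gethash ,name *loaded*))
                              ,(and friend
                                    `(gethash ,(person-name friend)
                                              *loaded*))))))
    `(setf (gethash ,name *loaded*)
           (make-instance 'person :name ,name))))

(defun collect-forms (groups)
  "For each group, emit one form creating all its members, then drain the queue."
  (let ((*after-forms* '())
        (forms '()))
    (dolist (group groups (nreverse forms))
      (push `(progn ,@(mapcar #'instance-form group)) forms)
      (loop while *after-forms*
            do (push (pop *after-forms*) forms)))))

(defun queue-demo ()
  "Rebuild two mutually referring objects from the collected forms."
  (let ((a (make-instance 'person :name "a"))
        (b (make-instance 'person :name "b")))
    (setf (person-friend a) b
          (person-friend b) a)
    (let ((forms (collect-forms (list (list a b)))))
      (clrhash *loaded*)
      (mapc #'eval forms)
      (let ((new-a (gethash "a" *loaded*))
            (new-b (gethash "b" *loaded*)))
        (and (eq (person-friend new-a) new-b)
             (eq (person-friend new-b) new-a))))))
```

The ordering matters: the queue is drained only after a whole top-level
form has been emitted, so every object created by that form exists
before any fixup refers to it.  Objects that refer to each other
therefore have to be created by the same top-level form, or the fixups
have to be held back until all creation forms are out.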

Well, I see that each of the examples gives me enough raw material to
accomplish this task.  My final concern is that both RSL and KANDERSO
use SI:WRITING-BIN-FILE, SI:DUMP-ATTRIBUTE-LIST, and
SI:DUMP-FORM-TO-EVAL.  I always prefer to avoid using undocumented
functions when possible.  In this case maybe I can't.  Any opinions on
how safe these are to build on?  For example, will I still be safe in
Release 8.0?  Hopefully, in the longer term I can rewrite this stuff
using the ANSI Common Lisp MAKE-LOAD-FORM that Moon mentioned.
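
For reference, here is a minimal sketch of what a MAKE-LOAD-FORM method
looks like in the form ANSI Common Lisp eventually standardized: the
method returns two values, a creation form and an initialization form,
which is the same split as the two-pass schemes above.  The RING-NODE
class and its accessors are hypothetical names made up for this
example.

```lisp
;;; Sketch of a MAKE-LOAD-FORM method for objects that refer to each other.
(defclass ring-node ()
  ((label :initarg :label :accessor ring-label)
   (next :initform nil :accessor ring-next)))

(defmethod make-load-form ((node ring-node) &optional environment)
  (declare (ignore environment))
  (values
   ;; Creation form: builds a bare object and must not refer to NODE
   ;; itself or to anything that refers back to it.
   `(make-instance ',(class-name (class-of node))
                   :label ',(ring-label node))
   ;; Initialization form: may refer to NODE and to other dumped objects;
   ;; the file compiler and loader arrange for it to run after creation.
   `(setf (ring-next ',node) ',(ring-next node))))
```

With a method like this defined, the file compiler calls MAKE-LOAD-FORM
itself whenever a RING-NODE appears as a literal object in a file being
compiled, so no explicit fixup list has to be maintained by hand.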