[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Issue: SETF-SUB-METHODS (Version 4)



Issue: 		SETF-SUB-METHODS

References: 	CLtL pp. 95-96, 99, 105-106, 166
		Issue: PUSH-EVALUATION-ORDER

Category: 	CLARIFICATION

Edit history:  Version 1: JonL White & Ken D. Olum 12-Feb-88
		(based on problem originally called SETF-METHOD-FOR-SYMBOLS)
               Version 2: JonL White 23-May-88 (fix references and spellings).
               Version 3: JonL White 25-May-88 
                (Incorporate suggestions of moon:
                    Add explicit details for CHAR-BIT and MASK-FIELD.
                    Alter proposed semantics for setf-of-getf to accord with
                      current practice, rather than as suggested by original
                      correspondent; add defense for this interpretaton.
		 Delete references to the misleading, retracted proposal 
			[SETF-METHOD-FOR-SYMBOLS:TEMPORARY-VARIABLE]
                 Retracted misleading references about DELETE.
                 Clarify and unify overall presentation.)
               Version 4: JonL White & Ken D. Olum 26-May-88 (final insights!)



Problem description:


Tim Daly of IBM noticed an anomaly in that several implementations do not 
appear to observe left-to-right order of evaluation in the following form:

     (setq r '(a 1 b 2 c 3))
     (setq s r)
     (setf (getf r 'b) (progn (setq r nil) 6))

In these implementations, the side-effect of the setq appears to happen before
the evaluation of the place form (getf r 'b) which is necessary to fetch the 
list being updated.   A typical result is 'r' = (B 6), 's' = (A 1 B 2 C 3)
after the operation.  

There is a similar problem with SETF's over LDB, MASK-FIELD, and CHAR-BIT.

It is not necessary to clarify that left-to-right order of evaluation
semantics should be observed; CLtL p99 is quite explicit about that.
Rather, a distinction needs to be made between the computations involved
in "evaluation" of the subforms, and other computations that are implicit
in the notion of "doing an access" or "doing a store".

There are implications in this issue for the "read-modify-write" macros
such as INCF, DECF, PUSH, POP, and REMF, as well as for PSETF, SHIFTF, and
ROTATEF.  However, a good underpinning for SETF over the basic accessors
is all that is necessary in order to assure correct treatement for the
more compound forms.

The issue PUSH-EVALUATION-ORDER is a clarification about just the point
of the evaluation order for the various subforms to a PUSH;  thus there
is a similarity to this issue, but the present issue has a much deeper
problem because of the need to call GET-SETF-METHOD during setf macro
expansion.




Proposal:	SETF-SUB-METHODS:DELAYED-ACCESS-STORES


We propose to elaborate the documentation of SETF, specifically in the case 
of access forms whose sub-forms are permitted to be generalized variable 
references [and which thus need to call GET-SETF-METHOD during setf macro 
expansion] in the ways described below.

Here are the specific steps for interpreting a SETF of an access function 
which itself admits a generalized variable as an argument.  Remember, first,
that GET-SETF-METHOD returns the following:

   -- Some temporary variables
   -- A list of value-producing forms
   -- A list of store-variables (normally one).
   -- A storing form.
   -- An accessing form.

The code produced as the macro expansion of such a SETF form must essentially
do the following major steps:

  ** It evaluates the value-producing sub-forms, in left-to-right order, and 
     binds the temporary variables to them.  This will be called "binding the 
     temporaries."

  ** It "reads the value" from the generalized variable using the supplied 
     accessing form, to get the "old value";  this will be called "doing the
     access."  [Note that this is done after all the evaluations of the 
     preceeding step, including any side-effects they may have.]

  ** It binds the store-variable to a new value, and then installs this
     new value into the generalized variable using the supplied "storing 
     form".   This will be called "doing the store."

In particular, we must emphasize the fact that "reading the value" of a 
generalized variable reference is not part of the series of evaluations 
that must be done in left-to-right order.  Compare this with the statement 
of CLtL p99, third from last paragraph, "... the generalized variables are
both read and written in the same reference.  Preserving the source
program order of evaluation and the number of evaluations is equally 
important." 

Note carefully that in the (degenerate?) case when a generalized variable 
is in fact simply a program variable, then there are no sub-forms to be
considered "value-producing" forms; in fact, "doing the access" for such
a generalized variable (i.e. a program variable) is functionally the
same as evaluating it.  This explains why the behaviour that Tim Daly 
observed is at first perplexing -- the "do the access" step has the same
semantics as an evaluation step, even though it is done after all the
prescribed evaluations.



The place-specifier forms listed at the top of CLtL p96 permit admit (other)
place-specifiers as arguments; during the SETF expansion of these forms, it 
is necessary to call GET-SETF-METHOD in order to figure out how the inner, 
nested generalized variable must be treated.   We require that GETF be 
listed among these forms, since it must have a sub-recursive <place> specifier 
[however, there is no Common Lisp function serving as a pseudo-update function
for it, the way DPB serves for LDB].  See CLtL at the bottom of p95 and the 
top of p96.

For each place-specifier form with a sub-recursive place specifier, we will 
now describe how the information from GET-SETF-METHOD is used.


  CHAR-BIT:

    In a form such as:

        (SETF (CHAR-BIT <place-form> <bit-name>) <value-form>)

    the place referred to by the <place-form> must always be both accessed 
    and updated; note that the update is to the generalized variable 
    specified by <place-form> -- not to a character object itself.

    Thus this SETF should generate code to do the following:

    1. Bind the temporaries for <place-form>
    2. Evaluate <bit-name> (and bind into a temporary)
    3. Evaluate <value-form> (and bind into the store variable)
    4. Do the access to <place-form>
    5. Do the store into <place-form>, with the given bit-name of the 
       character fetched in step 4 changed to reflect the value from step 3.

    If the evaluation of <value-form> in step 3 alters what is found in the 
    given "place" -- such as setting a different "bit" of the character --
    then the change of the bit denoted by <bit-name> will be to that altered
    character, because the "access" step is done after the <value-form>
    evaluation.  See example 1 in the test cases section.  Nevertheless, the 
    evaluations required for binding the temporaries are done in steps 1 and 
    2, and thus the expected left-to-right evaluation order is seen.


  LDB: 

    In a form such as:

        (SETF (LDB <byte-spec> <place-form>) <value-form>)

    the place referred to by the <place-form> must always be both accessed 
    and updated;  note that the update is to the generalized variable 
    specified by <place-form> -- not to any object of type integer.

    Thus this SETF should generate code to do the following:

    1. Evaluate <byte-spec> (and bind into a temporary)
    2. Bind the temporaries for <place-form>
    3. Evaluate <value-form>  (and bind into the store variable)
    4. Do the access to <place-form>
    5. Do the store into <place-form> with the given bit-field of the integer
       fetched in step 4 replaced with the value from step 3.

    If the evaluation of <value-form> in step 3 alters what is found in the 
    given "place" -- such as setting a different bit-field of the integer --
    then the change of the bit-field denoted by <byte-spec> will be to that 
    altered integer, because the "access" step is done after the <value-form>
    evaluation.  See example 2 in the test cases section.  Nevertheless, the 
    evaluations required for binding the temporaries are done in steps 1 and 
    2, and thus the expected left-to-right evaluation order is seen.


  MASK-FIELD:

   This case is the same as LDB in all essential aspects.


  GETF:

    In a form such as:

        (SETF (GETF <place-form> <ind-form>) <value-form>)

    the place referred to by the <place-form> must always be both accessed 
    and updated;  note that the update is to the generalized variable 
    specified by <place-form> -- not necessarily to the particular list
    which is the property list in question.

    Thus this SETF should generate code to do the following:

    1. Bind the temporaries for <place-form> 
    2. Evaluate <ind-form> (and bind into a temporary)
    3. Evaluate the <value-form> (and bind into the store variable)
    4. Do the access to <place-form>
    5. Do the store into <place-form> with a possibly-new property list
       obtained by combining the values from steps 2, 3, and 4.  

    If the evaluation of <value-form> in step 3 alters what is found in the 
    given "place" -- such as setting a different named property in the list,
    then the change of the property denoted by <ind-form> will be to that 
    altered list, because the "access" step is done after the <value-form>
    evaluation.  See example 7 in the test cases section.  Nevertheless, the 
    evaluations required for binding the temporaries are done in steps 1 and 
    2,  and thus the expected left-to-right evaluation order is seen.

    Note that this phrase "possibly-new property list" treats the 
    implementation of property lists as a "black box"  -- it can mean that 
    the former property list is somehow destructively re-used, or it can 
    mean partial or full copying of it.  This is like the question of REMOVE
    or DELETE -- do you copy or do you destructively alter.  Since the answer
    could go either way, the treatment of the resultant value for the 
    "possibly-new property list" must proceed as if it were a different copy
    needing to be stored back into the generalized variable.



Test Cases:


  1. (setq char (make-char #\A 1))         ==>  #\Control-A
     (rotatef (char-bit char :control) 
              (char-bit char :meta))       ==>  <dont care>
     char  ==>  #\Meta-A
     ;; It's as if you start with #\Control-A, and then first turn the
     ;;  :control bit off, because the :meta bit was originally off; and
     ;;  then to the resulting #\A,  you add the :meta bit since the
     ;;  :control bit was originally on.

     Note, however, that if an implementation doesn't support both of these
     character 'bits', then this test case would have to be re-written to
     reference two independent bits actually supported.  If an implementation
     supports fewer than two independent character bits, then this test case
     is entirely moot.

  2. (setq integer #x69)                   ==>  #x69
     (rotatef (ldb (byte 4 4) integer) 
              (ldb (byte 4 0) integer))    ==>  <dont care>
     integer  ==>  #x96
     ;; This very-realistic example is simply trying to swap two
     ;;  independent bit fields in an integer.  Note that the generalized
     ;;  variable of interest here is just the (possibly local) program
     ;;  variable 'integer'.

  3a.(setq l1 (setq l2 (list #x69)))                ==>  (#x69)
     (setf (ldb (byte 4 4) (car l1))
	   (ldb (byte 4 0) (car (prog1 l1 
                                  (setq l1 nil))))) ==>  <dont care>
     l1 ==> nil
     l2 ==> (#x99)
     ;; Note that the (setq l1 nil) didn't affect the actions of the setf
     ;;  at all, since l1 was evaluated and its value was saved away in a
     ;;  temporary variable as part of the step "2. Bind the temporaries 
     ;;  for <place-form>", and this was done before the evaluation of the
     ;;  <value-form> which contains the (setq l1 nil).  Note also that the
     ;;  step "4. Do the access to <place-form>" means fetching the CAR of
     ;;  the saved (temporary) value of 'l1'; it does not mean doing a LDB
     ;;  on anything like that.


  3b.(setq l1 (setq l2 (list #x69)))                ==>  (#x69)
     (setf (ldb (byte 4 4) (car l1))
	   (ldb (byte 4 0) (car (rplaca l1 #x17)))) ==>  <dont care>
     l1 ==> (#x77)
     l2 ==> (#x77)
     ;; Note that the (rplaca l1 #x17) altered the contents of what l1
     ;;  was pointing to.  Thus even though l1 was evaluated and its  
     ;;  value was saved away in a temporary variable as part of the step 
     ;;  "2. Bind the temporaries for <place-form>", and even though this 
     ;;  was done before the evaluation of the <value-form> which contains 
     ;;  the rplaca, still the side-effect changes things because it alters
     ;;  what will be fetched during the "do the access" step.

  4. (setq integer #x69)
     (setf (mask-field (byte 4 4) integer) (incf integer)) => #x6A
     integer ==> #x6A

  5. (setq integer #x6A)
     (setf (mask-field (byte 4 4) integer) (ash (incf integer) 4)) => #x6B0
     integer => #xBB

  6. (setq s (setq r (list 'a 1 'b 2 'c 3)))         ==>  (a 1 b 2 c 3)
     (setf (getf r 'b) 
           (progn (setq r nil) 6))                   ==>  6
     r ==> (b 6)
     s ==> (a 1 b 2 c 3)
     ;; Note that the generalized variable of concern here is the (degenerate?)
     ;;  one of simply the program variable 'r'; it is not a property-list 
     ;;  slot denoted by (getf r 'b).   At the time the step "4. Do the access
     ;;  to <place-form>" is performed, the evaluation of the <value-form>
     ;;  has already altered the generalized variable 'r', and thus a nil is
     ;;  returned for this access; that is why a fresh property-list (B 6) is
     ;;  created an stored back into 'r'.

  7. (setq s (setq r (list (list 'a 1 'b 2 'c 3))))  ==>  ((a 1 b 2 c 3))
     (setf (getf (car r) 'b) 
           (progn (setq r nil) 6))                   ==>  6
     r ==> nil
     s ==> ((A 1 B 6 C 3))
     ;; Note that the (setq r nil) does not affect the actions of the setf 
     ;;  because the value of R had already been saved in a temporary variable
     ;;  as part of the step "1. Bind the temporaries for <place-form>".  Only
     ;;  the CAR of this value will be accessed, and subsequently modified 
     ;;  after the value computation.



Rationale:


As a principle,

    (setf (foo-a x) (progn (setf (foo-b x) ...)
                           new-a-value))

should always set both of the "slots" of 'x', even if these slots are 
implemented as bit-fields, getf-properties, and so on.  Only by separating 
out evaluation from "generalized variable access", and by specifying that
the access is done after all the evaluations, can this correctly be done.



Current Practice:

Symbolics and Lucid already operate pretty much according to this proposal.



Cost to Implementors:

Several implementations would require several hours of programmer and
documentation time.



Cost to Users:

None; this is a clarification of a point not sufficiently elaborated in CLtL.



Cost of non-adoption:

Users will think SETF is unnatural in that left-to-right order of
evaluation isn't well specified.



Benefits:

Uniform semantics and portability in the face of recursive "place specifiers"
for SETF.  Setf of (LDB ... <place>) and of (GETF <place> ...) will behave
uniformly no matter the nature of the <place>.

Anyone who has copied examples from p105 and p106 will continue to
be able to use them.



Esthetics:

See "Cost of non-adoption"



Discussion:


In the detailed descriptions for each access form, the phrase
    "the place referred to by the <place-form> must always be both 
     accessed and updated; note that the update is to the generalized 
     variable specified by <place-form>"
is not intended to prevent optimizations that could occur when the
code "knows" that the new value will be EQ to the old one.  The only
requirements is that the results be semantically equivalent.

There is an interesting parallel between this case for GETF and the
very common mistake made by Lisp programmers with respect to the 
function DELETE.  How often the complaint is filed that DELETE didn't
do anything when it should have; but in fact the filer simply forgot
that delete, although permitted to destructively modify the original
list, may also return some tail of the list originally give it, 
whether or not an alteration occurs.

      (setq l '(a a b c d)) ==> (a a b c d)
      (delete 'a l)         ==> (b c d)
      l 		    ==> (a a b c d)

The unwary user thinks that because 'l' is still EQ to the original value 
that "DELETE didn't do anything".  The temptation to ignore the resultant 
value of DELETE parallels the temptation to forget about a need to perform
a final update to <place-form> in the setf-of-getf case.