[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Issue: SETF-SUB-METHODS



Ken Olum and I [JonL White] have given some thought to the problem that 
spurred the issue SETF-METHOD-FOR-SYMBOLS, and feel that the issue as 
stated attacks a small manifestation of the problem rather than the root
underlying case.  Worse yet, the TEMPORARY-VARIABLE proposal requires a 
change which creates a bug in another area of SETF usage [see below].

We propose that the problem lies in understanding of how SETF works when 
the SETF expansion for a form like 
	(<access-fun> <place> ...) 
involves the sub-recursive SETF expansion of <place>, or in other words,
when the accessing technique isn't fully specified just by looking at 
the get-setf-method of <access-fun>.   The class of such <access-fun>'s 
is enumerated in CLtL, at the top of p96, except that GETF is missing 
 from that list [we propose that it be added there in the next manual].

This message will propose a clarification the issues of left-to-right
evaluation in such cases, and will also detail the actions that should
be taken by SETF methods on each of these place-types.

!

ISSUE: 		SETF-SUB-METHODS

References: 	CLtL pp. 95-96, 99, 105-106, 166
		Issue: PUSH-EVALUATION-ORDER
		Issue: SETF-METHOD-FOR-SYMBOLS

Category: 	Clarification

Edit history:  Version 1 JonL & KDO 12-Feb-88


Problem description:

Tim Daly of IBM noticed that several implementations do not observe
left-to-right order of evaluation in the following form:

     (setq r '(a 1 b 2 c 3))
     (setq s r)
     (setf (getf r 'b) (progn (setq r nil) 6))

In these implementations, the side-effect of the setq happens before the 
evaluation of the "subforms" necessary to fetch the list being updated. 
A typical result is that 'r' = (B 6), 's' = (A 1 B 2 C 3) after the 
operation.  Surely, it should at least be the case that 's' = (A 1 B 6 C 3), 
since the expectation is that the B property in this list is being 
(destructively) changed.

Similar problems exist with LDB and CHAR-BIT.

It is not necessary to clarify that left-to-right order of evaluation
semantics should be observed; CLtL p99 is quite explicit about that.
Rather, a distinction needs to be made between the computations involved
in "evaluation" of the subforms, and other computations that are implicit
in the notion of "doing an access" or "doing a store".

Proposal:	SETF-SUB-METHODS:DELAYED-ACCESS-STORES

Elaborate the documentation of SETF, especially in the case of access
forms whose sub-forms are permitted to be generalized variable references
[and thus which need to call GET-SETF-METHOD during setf macro expansion].

We remember that GET-SETF-METHOD returns the following:

  Some temporary variables
  A list of value-producing forms
  A list of store-variables (normally one).
  A storing form.
  An accessing form.

Then we want to say that:

  Code generated using GET-SETF-METHOD does 2 or 3 of the following
  things:

  It evaluates the value-producting forms, and binds the temporary
  variables to them.  This will be called "binding the temporaries."

  It evaluates the accessing form to get the old value of the place.
  This will be called "doing the access."

  It binds the store-variable to a new value and calls the storing form
  to install it in the place.  This will be called "doing the store."

  The forms listed at the top of CLtL p96 permit recursive <place>
  specifiers; for each one, we will describe how the sub-recursive 
  information from GET-SETF-METHOD is used.

  LDB: 

    In a form such as

    (SETF (LDB <byte-spec> <place-form>) <value-form>)

    the place referred to by the <place-form> must always be both
    accessed and updated.  Even if the <place-form> refers to a
    bignum, the bignum itself will not be modified but rather a new
    bignum stored in the <place-form>.  Thus this SETF should generate
    code to do the following:

    1. Evaluate <byte-spec>
    2. Bind the temporaries for <place-form>
    3. Evaluate <value-form>
    4. Do the access to <place-form>
    5. Do the store into <place-form> with the given bit-field
    replaced with the value from step 3.

    If the evaluation of <value-form> in step 3 sets a different
    bit-field in the given place then since the access is done later
    at step 4 this change will be preserved.  See ROTATEF example in
    discussion below.  Nevertheless the evaluations required for
    binding the temporaries are done in step 2, and thus the expected
    left-to-right evaluation order is seen.

  GETF:

    The case of GETF is complicated by the fact that two different
    "place" locators are involved: one to use if the specified
    indicator is present in the property list, and another if not.
    For example, in (SETF (GETF (AREF ... I) 'B) 6), if the I'th slot
    of the array is NIL, then that slot must be changed, but if it
    contains a list with the property B then only the cons cell with
    that property value needs to be changed.  This decision cannot be
    made at macro-expansion time.  It depends entirely on the contents
    of the list in question, and so must be delayed until the last
    possible moment.

    In the first place, GETF should be listed among the other place
    forms that admit place forms as one of their arguments.  See CLtL
    at the bottom of p95 and the top of p96.

    More specifically, the expansion of 

    (SETF (GETF <place-form> <ind-form>) <value-form>)

    should generate code to do the following:

    1. Bind the temporaries for <place-form> 
    2. Do the access to <place-form>
       [Binding the temporaries and then doing the access is equivalent to 
       evaluating the <place-form>.]
    3. Evaluate <ind-form> [and save the result in a temporary variable].
    4. Check whether the value from 2 has the indicator from 3.

    If so:
	5A. Find cons-cell after indicator from above
	6A. Evaluate <value-form>
	7A. RPLACA cons-cell from 5A with value from 6A
	    [In this case, we do not do a store into <place-form>.  When 
	     the indicator is already present then the location specifed 
	    by <place-form> doesn't need a new value.]
    If not:
	5B. Evaluate <value-form>
	6B. Do the access to <place-form>, using the temporaries saved from 
	    step 1 [this is not another evaluation -- but it may involve
	    some non trivial computation, depending on how complex the
	    access method really is.].
	7B. Do the store into <place-form>, again using the temporaries saved
	    from step 1, setting it to a list with indicator from 3, new value
	    from 5B, and old list from 6B.


Test Cases:
      (setq integer #x69)
      (rotatef (ldb (byte 4 4) integer) (ldb (byte 4 0) integer))


Rationale:

 As a principle,

    (setf (foo-a x) (setf (foo-b x) ...))

 should always set both of the "slots" of X, even if these slots are 
 implemented as bit-fields, getf-properties, and so on.

 However,

    (setf (foo-a x) (setf (foo-a x) ...)) 

 is an error [that is, what happens is undefined.]

Current Practice:

 -- Xerox and Franz already operate on GETF according to this perscription.
 -- Symbolics and Lucid differ by always doing a setf on the variable
    rather than updating the appropriate cons cell of the property list; 
    additionally, they fail to connect the new value put into 'r' to the
    original property list which was 'r's value initially.
 -- HP and VAX  Common Lisps update the cons cell, but then set the variable
    'r' again, nullifying the effect of the "(setq r nil)" in the <value-form>
    computation.

Adoption cost:

 Several implementations would require several hours of programmer and
 documentation time.

Cost of non-adoption:

 Users will think SETF is unnatural in that left-to-right order of
 evaluation isn't always observed.

Benefits:

 Uniform semantics and portability in the face of recursive "place specifiers"
 for SETF.  Setf of (LDB ... <place>) and of (GETF <place> ...) will behave
 uniformly no matter the nature of the <place>.

 Anyone who has copied examples from p105 and p106 will continue to
 be able to use them.

Conversion Cost:

 This is a clarification of a point not sufficiently elaborated in CLtL.
 Symbolics and Lucid are already modifying their interpretation of this
 code -- in some way or other.

Esthetics:

 See "Cost of non-adoption"


Discussion:

 In the case that spurred this discussion,

     (setq r '(a 1 b 2 c 3))
     (setq s r)
     (setf (getf r 'b) (progn (setq r nil) 6))

 the consequent update is a RPLACA to some list cell -- not a setf of
 the variable 'r' -- and thus 'r' should be NIL and 's' should now be
 (A 1 B 6 C 3).

 There is an interesting parallel between this case for GETF and the
 very common mistake made by Lisp programmers with respect to the 
 function DELETE.  How often the complaint is filed that DELETE didn't
 do anything when it should have; but in fact the filer simply forgot
 that delete can either update some CAR slot of the list originally 
 given it, or return some tail of the list originally give it, but
 not both!  E.g.

      (setq l '(a a b c d)) ==> (a a b c d)
      (delete 'a l)         ==> (b c d)
      l 		      ==> (a a b c d)

 The unwary user thinks that because 'l' is still eq to the original
 value that "DELETE didn't do anything".  The variability of action
 at runtime here parallels the variable choice of location specifier
 for (SETF (GETF ...) ...)

 A previous proposal to fix this problem was misdirected.  It was
 phrased as follows:

      Proposal: SETF-METHOD-FOR-SYMBOLS:TEMPORARY-VARIABLE

      Change the result of (get-setf-method 'foo) from
      NIL NIL (#:G1174) (SETQ FOO #:G1174) FOO
      to
      (#:G1175) (FOO) (#:G1174) (SETQ FOO #:G1174) #:G1175

 The problem with this is that it breaks a relatively simple client of 
 setf technology:
      (setq integer #x69)
      (rotatef (ldb (byte 4 4) integer) (ldb (byte 4 0) integer))
 no longer does the "right thing".  Using the prescription for setf
 methods on symbols from  CLtL p105, the result of this 'rotatef' is
 that integer = #x96; using that given by the TEMPORARY-VARIABLE
 proposal leads to either #x99 [or #x66 depending on which order
 the implementation chooses to do the actual stores in].

 In addition, if 'integer' is replaced by '(car x)' here, then the 
 behavior is different under the TEMPORARY-VARIABLE proposal because 
 that only applies to symbols.

 Implicitly, two "places" are being specified here to be updated; but
 in fact the two places are not independent -- they both involve setq'ing
 the variable 'integer'.  Furthermore, each store operation implicitly
 requires fetching the value from that place in order to combine it
 with DPB.  It is necessary to delay the accesses until the last moment
 before combining with DPB in order to see the side-effects of the
 earlier store operations.