[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Issue: REMF-DESTRUCTION-UNSPECIFIED (Version 4)



Well, I've caved in to the EXPLICITLY-VAGUE camp due to changes in my
thinking over the past year on where it is appropriate to trade off
efficiency and portability. Anyway, guided by my newfound wisdom and
DLA's persistent view, I've stripped out my counter-proposal and
rearranged things a bit to make this neat and pretty. Hope this new
presentation doesn't raise too much uproar...
 -kmp

-----
Issue:        REMF-DESTRUCTION-UNSPECIFIED
References:   (SETF (GET ...) ...), REMPROP, (SETF (GETF ...) ...),
	      REMF (pp165-167); NREVERSE (p248); DELETE, DELETE-IF,
	      DELETE-IF-NOT, DELETE-DUPLICATES, NSUBSTITUTE, 
	      NSUBSTITUTE-IF, NSUBSTITUTE-IF-NOT (pp254-256); NCONC,
	      NRECONC (p269); NBUTLAST (p271); NSUBST, NSUBST-IF,
	      NSUBST-IF-NOT (p274); NUNION, NINTERSECTION,
	      NSET-EXCLUSIVE-OR (pp276-279).
Category:     CLARIFICATION/CHANGE
Edit history: 11-Feb-87, Version 1 by Dave Andre (DLA@Symbolics.COM)
	      29-Oct-87, Version 2 by Pitman (flesh out proposals)
	      28-Nov-88, Version 3 by Pitman (revised presentation)
	      29-Nov-88, Version 4 by Pitman (slight editing per DLA)
Status:	      For Internal Discussion

Problem Description:

 Currently, the exact nature of the side-effect performed by list
 modification routines is not specified.

 Either the specific modifications allowed should be specified so that
 programmers can rely on them and implementors can avoid accidentally
 causing problems by introducing well-meaning optimizations, or else
 the documentation should explicitly state that the effects are
 unspecified so that programmers will not depend on them and 
 implementors will feel comfortable about doing interesting optimizations.

Proposal (REMF-DESTRUCTION-UNSPECIFIED:EXPLICITLY-VAGUE):

 Clarify that the way in which the destructive behavior of the
 operators below is achieved is explicitly vague in a number of ways,
 in order to provide individual implementations the necessary
 flexibility to do useful optimizations.

 (SETF (GETF place indicator) value)
  is permitted to either SETF place or to SETF any part, CAR or
  CDR, of the top-level list structure held by that place.

 (REMF place indicator)
  is permitted to either SETF place or to SETF any part, CAR or
  CDR, of the top-level list structure held by that place.

 (SETF (GET symbol indicator) value)
  is constrained to behave exactly the same as
  (SETF (GETF (SYMBOL-PLIST symbol) indicator) value).

 (REMPROP symbol indicator)
  is constrained to behave exactly the same as
  (REMF (SYMBOL-PLIST symbol) indicator).

 (NREVERSE sequence)
  when sequence is a list, is permitted to SETF any part, CAR or
   CDR, of the top-level list structure in that sequence.
  when sequence is an array is permitted to re-order the elements
   of the given sequence in order to produce the resulting array.

 (DELETE object sequence ...)
  when sequence is a list, is permitted to SETF any part, CAR or
   CDR, of the top-level list structure held in that sequence.
  when sequence is an array is permitted to change the dimensions
   of the array and to slide its elements into new positions without
   permuting them to produce the resulting array.
  
 (DELETE-IF test sequence ...)
  is constrained to behave exactly like
  (DELETE NIL sequence
	  :TEST #'(LAMBDA (IGNORE ITEM) (FUNCALL test ITEM))
	  ...).

 (DELETE-IF-NOT test sequence ...)
  is constrained to behave exactly like
  (DELETE NIL sequence
	  :TEST #'(LAMBDA (IGNORE ITEM) (NOT (FUNCALL test ITEM)))
	  ...).

 (DELETE-DUPLICATES sequence ...)
  when sequence is a list, is permitted to SETF any part, CAR or
   CDR, of the top-level list structure held in that sequence.
  when sequence is an array is permitted to change the dimensions
   of the array and to slide its elements into new positions without
   permuting them to produce the resulting array.

 (NSUBSTITUTE new-object old-object sequence ...)
 (NSUBSTITUTE-IF new-object test sequence ...)
 (NSUBSTITUTE-IF-NOT new-object test sequence ...)
  when sequence is a list, is permitted to SETF any part, CAR or
   CDR, of the top-level list structure in that sequence.
  when sequence is an array is permitted to SETF the contents of
   any cell in that array which must be replaced by NEW-OBJECT.

  Note, however, that since this side-effect is not required,
  these functions should still not be used in for-effect-only
  positions in portable code.

 (NCONC . lists)
  is permitted to SETF any part, CAR or CDR, of the top-level of
  any of the given lists (except the last, which is not required to
  be a list and must not be modified).

 (NRECONC list tail)
  is constrained to have side-effect behavior equivalent to:
  (NCONC (NREVERSE list) tail).

 (NBUTLAST list ...)
  is permitted to SETF any part, CAR or CDR, of the top-level of
  any of the given list.

 (NSUBST new-object old-object tree)
 (NSUBST-IF new-object test tree) 
 (NSUBST-IF-NOT new-object test tree)
  is permitted to SETF any part of the TREE of conses which must
  be replaced by NEW-OBJECT.

  Note, however, that since the tree might be a degenerate tree
  containing no conses and since the side-effect is not required,
  these functions should still not be used in for-effect-only
  positions in portable code.

 (NUNION list1 list2 ...)
 (NINTERSECTION list1 list2 ...)
 (NSET-EXCLUSIVE-OR list1 list2 ...)
  is permitted to SETF any part, CAR or CDR, of the top-level of
  any of the given lists.

  Note, however, that since this side-effect is not required,
  these functions should still not be used in for-effect-only
  positions in portable code.

 Note: The above clarifications are not intended as complete functional
 descriptions. They are intended to augment (rather than to replace)
 other descriptions already in effect.

Test Cases:

 For GETF...

    (SETQ FOO (LIST 'A 'B 'C 'D 'E 'F))    ==> (A B C D E F)
    (SETQ BAR (CDDR FOO))                  ==> (C D E F)
    (REMF FOO 'C)
    BAR				           ==> ??

    In Symbolics Common Lisp, BAR holds (C D E F).
    CLtL allows other interpretations. eg, BAR might hold
    (C), (NIL), (C NIL) or (C D).
    Under this proposal, any of these interpretations (and others as well)
    would still be valid

 For DELETE...

    (SETQ FOO (LIST 'A 'B 'C))   ==> (A B C)
    (SETQ BAR (CDR FOO))         ==> (B C)
    (SETQ FOO (DELETE 'B FOO))   ==> (A C)
    BAR                          ==> ??
    (EQ (CDR FOO) (CAR BAR))     ==> ??

    In Symbolics Common Lisp, these last two expressions return ((C)) and T.
    Under this proposal, either of these interpretations (and others
    as well) would be valid.

Rationale:

 Implementations already vary widely on their implementation techniques
 for these functions. This effectively clarifies the status quo, making
 it more clear to programmers what they may rely upon in portable code.

Current Practice:

 All valid implementations are believed to comply.

Cost to Implementors:

 None. This is the status quo for implementors.

Cost to Users:

 This change would not affect programs coded with "good programming
 practice".  That is, only programs which rely on currently undocumented
 features would be in any danger of breaking.  In fact, those programs
 are already in such danger, and this change to the documentation would
 just publicize it.  The clarification would -encourage- good programming
 practice by warning people to only obey the published contract of the
 above-mentioned functions.

 There is, however, no automatic technique for making this check for
 programs already in error. Bugs due to unexpected side-effects are in
 general among the hardest to reckon with.

Cost of Non-Adoption:

 Programmers may naively believe there is only one possible or reasonable
 implementation of these functions. Some implementors may shy away from
 reasonable optimizations out of a paranoid belief that deviating from 
 some vague, unspoken rules will lead to programmer unrest. Making these
 things explicitly vague clarifies the implementor's rights in a way that
 permits numerous useful optimizations.

Benefits: 

 Users would be discouraged from taking advantage of subtle details
 of these destructive operations because such details would be explicitly
 not guaranteed to be portable.

 Implementations can improve performance of many of the above-mentioned
 functions when they are not under the constraint to implement them
 in a highly constrained fashion. For example, in Symbolics systems,
 DELETE of a cdr-coded list could use the implementation primitive
 %CHANGE-LIST-TO-CONS rather than RPLACD to avoid creating forwarding
 pointers.

 Garbage collection effectiveness can also be improved. For example,
 all of the destructive operations which remove objects (eg, REMF)
 could remove CAR pointers to removed objects which are more volatile
 than the list itself, assisting the garbage collector and thereby
 improving memory usage and paging performance.

Non-Benefits:

 Users who inadvertently depend on side-effect behavior may be rudely
 surprised when moving between implementations.

 Compatibility with older Lisp dialects is diminished.

Performance Impact:

 Metering in Symbolics test  systems have shown that there are substantial
 performance gains to be had by allowing implementations flexibility in
 these areas.

Aesthetics:

 Most of these functions implement abstract operations. For example,
 REMPROP implements an abstract operation on a "property list".
 Proper language design should not encourage people to delve below the
 level of abstraction into the nitty gritty.

Discussion:

 Andre's original version of this proposal pushed for explicitly vague
 descriptions of these functions' side-effect behavior.  He believes
 that if users want more predictability from these functions, they
 should write private variants that implement whatever predictability
 they require. 

 Pitman originally opposed this position because he weighed portability
 a higher concern. Since the original discussion, however, his views on
 how to resolve this priority have been refined, and he now believes
 that leaving things vague is appropriate. As such, he now supports what
 is effectively Andre's original proposal.

 Pitman and Andre support this proposal.