[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

REMF-DESTRUCTION-UNSPECIFIED (Version 2)



There has been no discussion on DLA's REMF-DESTRUCTION-UNSPECIFIED
proposal.

I apologize for the length of this proposal; I propose that only half of
it reach X3J13. We just have to figure out which half.

In certain ways, my feeling is that potential implementation speed is up
against debuggability here.  There may be other ways of characterizing
the skew as well.  If you mostly agree with one half but have minor
problems, please vote for that half and then propose changes to smooth
things out once we have things cut down to size. I think we should get a
general sense for how people want to proceed globally before doing a lot
of nitpicking.

-----
Issue:        REMF-DESTRUCTION-UNSPECIFIED
References:   (SETF (GET ...) ...), REMPROP, (SETF (GETF ...) ...),
	      REMF (pp165-167); NREVERSE (p248); DELETE, DELETE-IF,
	      DELETE-IF-NOT, DELETE-DUPLICATES, NSUBSTITUTE, 
	      NSUBSTITUTE-IF, NSUBSTITUTE-IF-NOT (pp254-256); NCONC,
	      NRECONC (p269); NBUTLAST (p271); NSUBST, NSUBST-IF,
	      NSUBST-IF-NOT (p274); NUNION, NINTERSECTION,
	      NSET-EXCLUSIVE-OR (pp276-279).
Category:     CLARIFICATION
Edit history: 11-Feb-87, Version 1 by DLA@Stony-Brook.SCRC.Symbolics.COM,
	      29-Oct-87, Version 2 by Pitman (flesh out proposals)
Status:	      For Internal Discussion

Problem Description:

 Currently, the exact nature of the side-effect performed by list
 modification routines is not specified.

Test Cases:

 For GETF...

    (SETQ FOO (LIST 'A 'B 'C 'D 'E 'F))    ==> (A B C D E F)
    (SETQ BAR (CDDR FOO))                  ==> (C D E F)
    (REMF FOO 'C)
    BAR				           ==> ??

    In Symbolics Common Lisp, BAR holds (C D E F).
    CLtL allows other interpretations. eg, BAR might hold
    (C), (NIL), (C NIL) or (C D).
    In MAKE-EXPLICITLY-DEFINED, BAR would hold (C D E F).
    In MAKE-EXPLICITLY-VAGUE, any of these interpretations (and
    others as well) would still be valid

 For DELETE...

    (SETQ FOO (LIST 'A 'B 'C))   ==> (A B C)
    (SETQ BAR (CDR FOO))         ==> (B C)
    (SETQ FOO (DELETE 'B FOO))   ==> (A C)
    BAR                          ==> ??
    (EQ (CDR FOO) (CAR BAR))     ==> ??

    In Symbolics Common Lisp, these last two expressions return ((C)) and T.
    In MAKE-EXPLICITLY-DEFINED, they would return (B C) and NIL.
    In MAKE-EXPLICITLY-VAGUE, either of these interpretations (and others
    as well) would be valid.

Proposal (REMF-DESTRUCTION-UNSPECIFIED:MAKE-EXPLICITLY-DEFINED):

 Clarify that the destructive behavior of the following operators
 is restricted as indicated. Note well -- only the side-effect behavior
 of these functions is discussed here; these remarks are not intended
 as complete functional descriptions.

 (SETF (GETF place indicator) value)
  is permitted to either SETF place or to SETF the CADR of what
  (GET-PROPERTIES place (LIST indicator)) would return.

 (REMF place indicator)
  is permitted to either SETF place or to SETF the CDR of the
  part of the top-level list in place which points to what
  (GET-PROPERTIES place (LIST indicator)) would return.
  In particular, the two removed cells may not be destructively
  modified.

 (SETF (GET symbol indicator) value)
  is constrained to behave exactly the same as
  (SETF (GETF (SYMBOL-PLIST symbol) indicator) value).

 (REMPROP symbol indicator)
  is constrained to behave exactly the same as
  (REMF (SYMBOL-PLIST symbol) indicator).

 (NREVERSE sequence)
  when sequence is a list is permitted to setf the CDR of any
   subtail of sequence to point to any other subtail of sequence,
   to NIL, or to any piece of newly created list structure.
  when sequence is an array is permitted to re-order the elements
   of the given sequence in order to produce the resulting array.

 (DELETE object sequence ...)
  when sequence is a list is permitted to SETF the CDR of any part
   of that list which points to a tail whose car is object.
  when sequence is an array is permitted to change the dimensions
   of the array and to slide its elements into new positions without
   permuting them to produce the resulting array.
  
 (DELETE-IF test sequence ...)
  is constrained to behave exactly like
  (DELETE NIL sequence
	  :TEST #'(LAMBDA (IGNORE ITEM) (FUNCALL test ITEM))
	  ...).

 (DELETE-IF-NOT test sequence ...)
  is constrained to behave exactly like
  (DELETE NIL sequence
	  :TEST #'(LAMBDA (IGNORE ITEM) (NOT (FUNCALL test ITEM)))
	  ...).

 (DELETE-DUPLICATES sequence ...)
  when sequence is a list is permitted to SETF the CDR of any part
   of that list which points to a duplicated element that is to be
   removed.
  when sequence is an array is permitted to change the dimensions
   of the array and to slide its elements into new positions without
   permuting them to produce the resulting array.

 (NSUBSTITUTE new-object old-object sequence ...)
 (NSUBSTITUTE-IF new-object test sequence ...)
 (NSUBSTITUTE-IF-NOT new-object test sequence ...)
  when sequence is a list is permitted to SETF the CAR of any part
   of that list which must be replaced by NEW-OBJECT.
  when sequence is an array is permitted to SETF the contents of
   any cell in that array which must be replaced by NEW-OBJECT.

  Note, however, that since this side-effect is not required,
  these functions should still not be used in for-effect-only
  positions in portable code.

 (NCONC . lists)
  is permitted to SETF the CDR of the LAST of any of its argument
  lists which are non-null, except the last argument.

 (NRECONC list tail)
  is constrained to have side-effect behavior equivalent to:
  (NCONC (NREVERSE list) tail).

 (NBUTLAST list ...)
  is permitted to SETF the tail of its argument list whose CDR
  is LAST of that argument LIST.

 (NSUBST new-object old-object tree)
 (NSUBST-IF new-object test tree) 
 (NSUBST-IF-NOT new-object test tree)
  is permitted to SETF any part of the TREE of conses which must
  be replaced by NEW-OBJECT.

  Note, however, that since the tree might be a degenerate tree
  containing no conses and since the side-effect is not required,
  these functions should still not be used in for-effect-only
  positions in portable code.

 (NUNION list1 list2 ...)
 (NINTERSECTION list1 list2 ...)
 (NSET-EXCLUSIVE-OR list1 list2 ...)
  is permitted to setf the CDR of any subtail of either list to point
  to any other subtail of either list, to NIL, or to any piece of newly
  created list structure.

 Rationale:

  This implements what some users will consider the status quo.
  In most cases, it is consistent with what naive implementations of
  these functions might do.

 Benefits: 

  By being more explicit about the side-effects which can be expected,
  users can write programs which more reliably exploit these side-effects.
  In some case, this may make certain programming problems easier.

  Certain naive user assumptions and assumptions based on the behavior
  of older lisp dialects would be supported.

  Compatibility with older Lisp dialects (eg, Zetalisp, Maclisp, ...)
  would generally be better.

  Gratuitious porting hassles would be naturally lessened slightly.
  In situations where programmers inadvertently depended upon the specific
  nature of a particular operator's side-effect, chances would be much
  greater that the code would be portable because there would be less room
  for implementations to vary.

 Non-Benefits:

  Implementations may be forced to be less efficient than they could be
  if the MAKE-EXPLICITLY-VAGUE proposal were adopted, since that proposal
  allows implementors more room for optimization.  Some existing 
  implementations will be forced to become less efficient to meet the
  standard, degrading performance of existing applications which do not
  rely on the new features with no benefit to those existing applications.

 Adoption Cost:

  Some implementations would have to change. For example, Symbolics Common
  Lisp makes more unusual assumptions about what it can modify than some
  of the above rules allow.

 Conversion Cost:

  Probably none. Users must currently tread lightly since they really
  have no idea in many cases what kind of side-effect is really likely to
  occur when they use some of the existing destructive operators.
  Some implementations might be forced to change, and since some user
  programs might have depended on the old behavior, programs which used
  to run might be broken in the transition. However, in most cases where
  an implementation guarantees any behavior, it is likely to be the one
  suggested here. Systems which have other behaviors are likely to warn
  users not to depend on the specifics of those behaviors. So the incidence
  of problems arising in practice is likely to be very small, though
  probably not nonzero.

 Aesthetics:

  Some of these restrictions tend to expose more detail about the
  implementation of these operators, turning them from abstract operations
  into more concrete utilities. This is probably an issue that can be
  addressed by appropriate information hiding in a User's Manual however.

  No matter how unpleasant the presence of these somewhat concrete
  restrictions may seem, the porting bugs which arise in their absence are
  bound to seem more unpleasant to some users.

Proposal (REMF-DESTRUCTION-UNSPECIFIED:MAKE-EXPLICITLY-VAGUE):

 Clarify that the destructive behavior of the following operators
 is restricted as indicated:

 (SETF (GETF place indicator) value)
  is permitted to either SETF place or to SETF any part, CAR or
  CDR, of the top-level list structure held by that place.

 (REMF place indicator)
  is permitted to either SETF place or to SETF any part, CAR or
  CDR, of the top-level list structure held by that place.

 (SETF (GET symbol indicator) value)
  is constrained to behave exactly the same as
  (SETF (GETF (SYMBOL-PLIST symbol) indicator) value).

 (REMPROP symbol indicator)
  is constrained to behave exactly the same as
  (REMF (SYMBOL-PLIST symbol) indicator).

 (NREVERSE sequence)
  when sequence is a list, is permitted to SETF any part, CAR or
   CDR, of the top-level list structure in that sequence.
  when sequence is an array is permitted to re-order the elements
   of the given sequence in order to produce the resulting array.

 (DELETE object sequence ...)
  when sequence is a list, is permitted to SETF any part, CAR or
   CDR, of the top-level list structure held in that sequence.
  when sequence is an array is permitted to change the dimensions
   of the array and to slide its elements into new positions without
   permuting them to produce the resulting array.
  
 (DELETE-IF test sequence ...)
  is constrained to behave exactly like
  (DELETE NIL sequence
	  :TEST #'(LAMBDA (IGNORE ITEM) (FUNCALL test ITEM))
	  ...).

 (DELETE-IF-NOT test sequence ...)
  is constrained to behave exactly like
  (DELETE NIL sequence
	  :TEST #'(LAMBDA (IGNORE ITEM) (NOT (FUNCALL test ITEM)))
	  ...).

 (DELETE-DUPLICATES sequence ...)
  when sequence is a list, is permitted to SETF any part, CAR or
   CDR, of the top-level list structure held in that sequence.
  when sequence is an array is permitted to change the dimensions
   of the array and to slide its elements into new positions without
   permuting them to produce the resulting array.

 (NSUBSTITUTE new-object old-object sequence ...)
 (NSUBSTITUTE-IF new-object test sequence ...)
 (NSUBSTITUTE-IF-NOT new-object test sequence ...)
  when sequence is a list, is permitted to SETF any part, CAR or
   CDR, of the top-level list structure in that sequence.
  when sequence is an array is permitted to SETF the contents of
   any cell in that array which must be replaced by NEW-OBJECT.

  Note, however, that since this side-effect is not required,
  these functions should still not be used in for-effect-only
  positions in portable code.

 (NCONC . lists)
  is permitted to SETF any part, CAR or CDR, of the top-level of
  any of the given lists (except the last, which is not required to
  be a list and must not be modified).

 (NRECONC list tail)
  is constrained to have side-effect behavior equivalent to:
  (NCONC (NREVERSE list) tail).

 (NBUTLAST list ...)
  is permitted to SETF any part, CAR or CDR, of the top-level of
  any of the given list.

 (NSUBST new-object old-object tree)
 (NSUBST-IF new-object test tree) 
 (NSUBST-IF-NOT new-object test tree)
  is permitted to SETF any part of the TREE of conses which must
  be replaced by NEW-OBJECT.

  Note, however, that since the tree might be a degenerate tree
  containing no conses and since the side-effect is not required,
  these functions should still not be used in for-effect-only
  positions in portable code.

 (NUNION list1 list2 ...)
 (NINTERSECTION list1 list2 ...)
 (NSET-EXCLUSIVE-OR list1 list2 ...)
  is permitted to SETF any part, CAR or CDR, of the top-level of
  any of the given lists.

 Rationale:

  This is probably the status quo from a typical implementor's point of view.

 Benefits: 

  Users would be discouraged from taking advantage of subtle details
  of these destructive operations because such details would be explicitly
  not guaranteed to be portable.

  Implementations can improve performance of many of the above-mentioned
  functions when they are not under the constraint to implement them
  in a highly constrained fashion. For example, in Symbolics systems,
  DELETE of a cdr-coded list could use the implementation primitive
  %CHANGE-LIST-TO-CONS rather than RPLACD to avoid creating forwarding
  pointers.

  Garbage collection effectiveness can also be improved. For example,
  all of the destructive operations which remove objects (eg, REMF)
  could remove CAR pointers to removed objects which are more volatile
  than the list itself, assisting the garbage collector and thereby
  improving memory usage and paging performance.

 Non-Benefits:

  Users who inadvertently depend on side-effect behavior may be rudely
  surprised when moving between implementations.

  Compatibility with older Lisp dialects is diminished.

 Adoption Cost:

  None. This is the status quo for implementors.

 Conversion Cost:

  This change would not affect programs coded with "good programming
  practice".  That is, only programs which rely on currently undocumented
  features would be in any danger of breaking.  In fact, those programs
  are already in such danger, and this change to the documentation would
  just publicize it.  The clarification would -encourage- good programming
  practice by warning people to only obey the published contract of the
  above-mentioned functions.

  There is, however, no automatic technique for making this check for
  programs already in error. Bugs due to unexpected side-effects are in
  general among the hardest to reckon with.

 Aesthetics:

  Most of these functions implement abstract operations. For example,
  REMPROP implements an abstract operation on a "property list".
  Proper language design should not encourage people to delve below the
  level of abstraction into the nitty gritty.

Discussion:

 Conversation on the Common-Lisp mailing list has made it clear that
 the current situation is not good because it does not make the designers'
 intent clear. The list modification allowed should either be specified,
 or explicitly documented as unspecified and up to the individual
 implementation. If the list modification is specified, then programmers
 can rely on the specification.  If it is unspecified, then implementors
 can take advantage of that to make their implementation more efficient.

 Pitman believes that REMF-DESTRUCTION-UNSPECIFIED:MAKE-EXPLICITLY-VAGUE
 may be better from a purist point of view, but strongly supports 
 REMF-DESTRUCTION-UNSPECIFIED:MAKE-EXPLICITLY-DEFINED
 because of practical considerations related to reliably being able to
 debug portable code, a stated priority of an ``industrial strength'' 
 language such as Common Lisp. The more alike implementations are forced
 to be, the better the chances that something that ran one place will run
 another.

 DLA, who raised this issue, disagrees strongly.  He cites efficiency
 and does not believe performance should be traded off for features
 which reasonable code does not tend to use.  Metering in Symbolics test
 systems have shown that there are performance gains to be had by
 allowing implementations flexibility in these areas.  DLA believes that
 if users want more predictability from these functions, they should
 write private variants that implement whatever predictability they
 require.  Additionally, he argues that the implementation users expect
 is not obviously the one chosen in MAKE-EXPLICITLY-DEFINED.  For
 example, many programmers have used NREVERSE for years and assumed that
 it shuffled list elements rather than CDRs.
 
 DLA points out that MAKE-EXPLICITLY-DEFINED is still lax enough that if
 FOO holds (A B) and you do (SETF (GETF FOO 'A) 'C), FOO could be
 (A C A B).  We should be sure this doesn't get left vague if that
 proposal is adopted.