[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

SEQUENCE-FUNCTIONS-EXCLUDE-ARRAYS (Version 2)



Issue:        SEQUENCE-FUNCTIONS-EXCLUDE-ARRAYS
References:   Sequences (pp245-261), COERCE (p51)
Category:     ENHANCEMENT
Edit history: 05-Feb-87, Version 1 by Touretzky
	      28-Apr-87, Version 2 by Pitman (variations on the theme)
Status:	      For Internal Discussion

Description:

  Common Lisp provides many useful operations on lists and vectors which
  don't apply to arrays.

  For example, one can FILL a vector with 0's, but not an array. One can
  REPLACE the contents of one vector with another, but one can't do this
  for arrays.  One can verify that EVERY element of a vector has some 
  property, but one can't do this for arrays. And so on.

  The programmer who wishes to use arrays instead of vectors must give up
  most of the useful tools CLtL provides for manipulating sequences, even
  though there is no intuitive reason why operations like FILL, REPLACE,
  and EVERY shouldn't work on arrays.

Notes about Voting:

  Select one of:
    SEQUENCE-FUNCTIONS-EXCLUDE-ARRAYS:GENERALIZE (by Touretzky)
    SEQUENCE-FUNCTIONS-EXCLUDE-ARRAYS:MODIFIED   (by Pitman)
    SEQUENCE-FUNCTIONS-EXCLUDE-ARRAYS:UNCHANGED  (status quo)
  [See also notes in the discussion about the possibility of a fourth
   way to go if you're not satisfied with any of these.]

Proposal (SEQUENCE-FUNCTIONS-EXCLUDE-ARRAYS:GENERALIZE):

  Common Lisp already provides a facility called "displaced arrays"
  which can be used to overlay one array on top of a portion of another,
  even if the two are of different ranks, so that the two share storage.
  Emphasize this as a way of explaining the behavior of sequence 
  functions on arbitrary arrays.

  Modify the definition of COERCE to allow the coercion of an array to
  a vector and vice versa. In keeping with p51 of CLtL, it should be an
  error if the result type has a different number of elements than the
  object being coerced.

  Modify the definition of MAKE-SEQUENCE to accept array descriptions as
  well as vector descriptions.

  Extend the definitions of sequence functions that either return their
  argument sequences or return non-sequences so that they also allow arrays.
  These functions should treat array arguments as vectors displaced to the
  array storage (in row-major format). The affected functions are LENGTH,
  ELT, COUNT, FIND, POSITION, SOME, EVERY, NOTANY, NOTEVERY, REDUCE, 
  SEARCH, MISMATCH, FILL, REPLACE, NSUBSTITUTE, NREVERSE, SORT.

  For example, suppose A is a 3x2x7 array. (LENGTH A) should return 42,
  and (ELT A 7) should return A[0,1,0].  :START and :END keywords would
  be interpreted relative to the vector, as would the results returned
  by POSITION and SEARCH.

  Extend the definitions of sequence functions whose result should be the
  same shape as but not necessarily EQ to some argument. These functions
  should deal with array arguments by returning an array of the same
  shape. The affected functions are SUBSTITUTE, REVERSE, and MAP.

  Expressly forbid arrays as arguments to sequence functions that modify
  the number of elements in the array because the shape would be undefined.
  These functions are SUBSEQ, COPY-SEQ, CONCATENATE, MERGE, REMOVE,
  REMOVE-DUPLICATES, DELETE, DELETE-DUPLICATES.

  Note that EQUALP does -not- treat arrays as vectors.  It is not a 
  sequence function, and it already has a well-defined behavior for arrays.
  To test whether the arrays A and B, of different shapes, have the same
  elements, one would write:
	(AND (= (LENGTH A) (LENGTH B)) (EVERY #'EQL A B)).

  Rationale:
  
    This proposal would expand rather than interfere with existing practice.
  
    Since displaced arrays are already part of Common Lisp, the cost of the
    proposed changes would be very low.
  
    If the change is not adopted, Common Lisp programmers who wish to use
    arrays will have two choices.  Either they must write nested DO loops
    every time they want to perform an array operation equivalent to FILL,
    REPLACE, REDUCE, etc., or else they can build displaced vectors by
    hand and pass them to the sequence functions when necessary.
  
  Adoption Cost:
  
    This would involve a lot of changes to functions, but all of them
    presumably minor. The presence of displaced arrays in the language
    already guarantees that the internal storage format needed to back
    up these proposed changes is already in place.
  
    Note that simply extending COERCE, MAKE-SEQUENCE, LENGTH, ELT, and
    the SETF expander for ELT would have the side effect of extending
    the remaining functions if they are written in the obvious way.
    For example:
  
	  (DEFUN SUBSTITUTE (NEW-ITEM OLD-ITEM SEQUENCE &KEY ...)
	    (LET ((RESULT (MAKE-SEQUENCE (TYPE-OF SEQUENCE))))
	      (DOTIMES (I (LENGTH SEQUENCE) RESULT)
		(SETF (ELT RESULT I)
		      (IF (EQL (ELT SEQUENCE I) OLD-ITEM) 
			  NEW-ITEM
			  (ELT SEQUENCE I))))))
  
  Benefits:
  
    Users of arrays do not have to use home-grown utilities to duplicate
    functionality already primitively provided to users of arrays. The
    sequence functions become useful in a variety of new situations.
  
  Conversion Cost:
  
    This change is `upward compatible.' User code should run unmodified.
  
  Aesthetics:
  
    This proposal extends sequence functions to cover arrays while neatly
    avoiding the temptation to turn Common Lisp into a half-baked APL.
    Instead of trying to provide a full set of array handling primitives
    (which would be needed to take arbitrary k-dimensional slices out of 
    n-dimensional arrays, or to apply an operator across a specific
    dimension of a multidimensional array), it requires just one rule:
    treat arrays as displaced vectors.
  
Proposal (SEQUENCE-FUNCTIONS-EXCLUDE-ARRAYS:MODIFIED):

  Common Lisp already provides a facility called "displaced arrays"
  which can be used to overlay one array on top of a portion of another,
  even if the two are of different ranks, so that the two share storage.
  Emphasize this as a way of explaining the behavior of sequence 
  functions on certain arrays.

  Modify the definition of COERCE to allow the coercion of an array to
  a vector and vice versa. In keeping with p51 of CLtL, it should be an
  error if the result type has a different number of elements than the
  object being coerced.

  Extend the definitions of sequence functions that either return their
  argument sequences, return sequences which are the same shape as their
  argument, or return non-sequences so that they also allow arrays iff
  their action is conceptually independent of the shape of the array.
  The affected functions are COUNT, SOME, EVERY, NOTANY, NOTEVERY,
  FILL, REPLACE, SUBSTITUTE, NSUBSTITUTE, and MAP.

  Expressly forbid arrays as arguments to other sequence functions. These
  unaffected functions are LENGTH, ELT, FIND, POSITION, REDUCE, SEARCH,
  MISMATCH, REVERSE, NREVERSE, SORT, MAP, SUBSEQ, COPY-SEQ, CONCATENATE,
  MERGE, REMOVE, REMOVE-DUPLICATES, DELETE, DELETE-DUPLICATES.

  Rationale:
  
    This proposal would expand rather than interfere with existing practice.
  
    Since displaced arrays are already part of Common Lisp, the cost of the
    proposed changes would be very low.
  
    If the change is not adopted, Common Lisp programmers who wish to use
    arrays will have two choices.  Either they must write nested DO loops
    every time they want to perform an array operation equivalent to FILL,
    REPLACE, etc., or else they can build displaced vectors by hand and
    pass them to the sequence functions when necessary.
  
    This proposal extends certain sequence functions in some interesting
    ways without committing us to a theory of how arrays and sequences
    relate that everyone may not be happy with right now.

  Adoption Cost:
  
    This would involve a lot of changes to functions, but all of them
    presumably minor. The presence of displaced arrays in the language
    already guarantees that the internal storage format needed to back
    up these proposed changes is already in place.
  
  Benefits:
  
    Users of arrays do not have to use home-grown utilities to duplicate
    functionality already primitively provided to users of arrays. The
    sequence functions become useful in a variety of new situations.
  
  Conversion Cost:
  
    This change is `upward compatible.' User code should run unmodified.
  
  Aesthetics:
  
    This extends certain existing sequence functions to allow arrays
    as arguments in a fairly non-controversial way, leaving aside the
    larger issue of whether and how to generalize the other sequence
    functions.
  
Proposal (SEQUENCE-FUNCTIONS-EXCLUDE-ARRAYS:UNCHANGED):

  This is the null proposal for the sake of voting clarity. Vote for this
  if you think things should not change.

Current Practice:

  Neither SEQUENCE-FUNCTIONS-EXCLUDE-ARRAYS:GENERALIZE nor
  SEQUENCE-FUNCTIONS-EXCLUDE-ARRAYS:MODIFIED are likely to be implemented
  anywhere since they are only very recently proposed.

Discussion:

  Touretzky supports SEQUENCE-FUNCTIONS-EXCLUDE-ARRAYS:GENERALIZE.
  He's been building displaced vectors to pass to sequence functions when
  necessary and really dislikes it.

  The members of the Cleanup committee expressed interest in the ideas 
  behind this proposal but weren't sure they could accept it in the
  proposed form. A rewrite to separate some of the issues more clearly
  was solicited.

  Rees suggested that if people are not sure about this a proposal, it might
  be possible to make fly a modified version of the proposal which extended
  only those functions which did not deal with positional information.
  Pitman wrote SEQUENCE-FUNCTIONS-EXCLUDE-ARRAYS:MODIFIED based on this idea
  and supports at least that much extension, and is sympathetic to (but not
  yet fully committed to the idea of the full proposal).

  Note that in the SEQUENCE-FUNCTIONS-EXCLUDE-ARRAYS:MODIFIED proposal,
  the function REDUCE is in a grey area. Many of its uses are not
  position-dependent, but some are. The same argument might be made about
  FIND. If people felt strongly, these too could be extended either by
  fudging the conservative rule or by explicit special case(s).

  [It's also possible that a still-more-general proposal might be
   interesting. For example, one that introduced and inverse to 
   ARRAY-ROW-MAJOR-INDEX called ARROW-ROW-MAJOR-SUBSCRIPTS, and have
   functions like POSITION that returned position information or things
   that take :START and :END arguments use subscript (rather than offset)
   information. eg,
    (SETQ A (MAKE-ARRAY '(2 2) :INITIAL-ELEMENT 0))
    (SETF (AREF A 1 0) '1)
    (POSITION 1 A) => (1 0) ;Rather than 2 as suggested above
   This might ease some people's minds if they're just worried that
   returning a 1-d index will feel funny for an any-d array. On the
   other hand, the linear ordering must still be well defined so it'll
   be clear what the search order is, what range is being selected when
   :START and/or :END is used, etc. so you can't hide the issue
   completely. Still, if anyone's interested in a full-blown proposal
   along these lines, they should ask me and I'll write it. --KMP]