[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

SEQUENCE-FUNCTIONS-EXCLUDE-ARRAYS (Version 3)



There's been no mail on this topic since May.  My reading of the mail at
that time was that most folks were lukewarm; I've tried to capture that
in the discussion section below.  I've rewritten this proposal to
include the GENERALIZE option and the watered down version in the
discussion. I'm not sure that is right. Oh well --- progress. Usual
caveats apply.

Has anyone had any new thoughts on this topic since May?


!
Issue:        SEQUENCE-FUNCTIONS-EXCLUDE-ARRAYS
References:   Sequences (pp245-261), COERCE (p51)
Category:     ENHANCEMENT
Edit history: 05-Feb-87, Version 1 by Touretzky
              28-Apr-87, Version 2 by Pitman (variations on the theme)
              26-Oct-87, Version 3 by Masinter (clean up for release)

Description:

Common Lisp provides many useful operations on lists and vectors which
don't apply to arrays.

For example, one can FILL a vector with 0's, but not an array. One can
REPLACE the contents of one vector with another, but one can't do this
for arrays.  One can verify that EVERY element of a vector has some
property, but one can't do this for arrays, and so on.

The programmer who wishes to use arrays instead of vectors must give up
most of the useful tools CLtL provides for manipulating sequences, even
though there is no intuitive reason why operations like FILL, REPLACE,
and EVERY shouldn't work on arrays.

Common Lisp already provides a facility called "displaced arrays" which
can be used to overlay one array on top of a portion of another, even if
the two are of different ranks, so that the two share storage or use the
awkward convention of creating a displaced array to the operand.
However, creating a displaced array merely to perform FILL, REPLACE or
EVERY is awkward.

Proposal (SEQUENCE-FUNCTIONS-EXCLUDE-ARRAYS:GENERALIZE):

1) Modify the definition of COERCE to allow the coercion of an array to
a vector and vice versa. In keeping with p51 of CLtL, it should be an
error if the result type has a different number of elements than the
object being coerced.

2) Modify the definition of MAKE-SEQUENCE to accept array descriptions
as well as vector descriptions.

3) Extend the definitions of sequence functions that either return their
argument sequences or return non-sequences so that they also allow
arrays. These functions should treat array arguments as vectors
displaced to the array storage (in row-major format). The affected
functions are LENGTH, ELT, COUNT, FIND, POSITION, SOME, EVERY, NOTANY,
NOTEVERY, REDUCE, SEARCH, MISMATCH, FILL, REPLACE, NSUBSTITUTE,
NREVERSE, SORT.

For example, suppose A is a 3x2x7 array. (LENGTH A) should return 42,
and (ELT A 7) should return the same element as (AREF A 0 1 0).  :START
and :END keywords would be interpreted relative to the vector, as would
the results returned by POSITION and SEARCH.

Extend the definitions of sequence functions whose result should be the
same shape as but not necessarily EQ to some argument. These functions
should deal with array arguments by returning an array of the same
shape. The affected functions are SUBSTITUTE, REVERSE, and MAP.

Expressly forbid arrays as arguments to sequence functions that modify
the number of elements in the array because the shape would be
undefined. These functions are SUBSEQ, COPY-SEQ, CONCATENATE, MERGE,
REMOVE, REMOVE-DUPLICATES, DELETE, DELETE-DUPLICATES.

Note that EQUALP does -not- treat arrays as vectors.  It is not a
sequence function, and it already has a well-defined behavior for
arrays. To test whether the arrays A and B, of different shapes, have
the same elements, one would write:

	(AND (= (LENGTH A) (LENGTH B)) (EVERY #'EQL A B)).

Rationale:
  
This proposal would expand rather than interfere with existing practice.
  
Since displaced arrays are already part of Common Lisp, the cost of the
proposed changes would be very low.
  
If the change is not adopted, Common Lisp programmers who wish to use
arrays will have two choices.  Either they must write nested DO loops
every time they want to perform an array operation equivalent to FILL,
REPLACE, REDUCE, etc., or else they can build displaced vectors by hand
and pass them to the sequence functions when necessary.
  
Adoption Cost:
  
This would involve a lot of changes to functions, but all of them
presumably minor. The presence of displaced arrays in the language
already guarantees that the internal storage format needed to back up
these proposed changes is already in place.
  
Note that simply extending COERCE, MAKE-SEQUENCE, LENGTH, ELT, and the
SETF expander for ELT would have the side effect of extending the
remaining functions if they are written in the obvious way.

For example:
  
(DEFUN SUBSTITUTE (NEW-ITEM OLD-ITEM SEQUENCE &KEY ...)
    (LET ((RESULT (MAKE-SEQUENCE (TYPE-OF SEQUENCE))))
     (DOTIMES (I (LENGTH SEQUENCE) RESULT)
	 (SETF (ELT RESULT I)
	       (IF (EQL (ELT SEQUENCE I) OLD-ITEM) 
	 	   NEW-ITEM
		   (ELT SEQUENCE I))))))
  
Benefits:
  
Users of arrays do not have to use home-grown utilities to duplicate
functionality already primitively provided to users of arrays. The
sequence functions become useful in a variety of new situations.
  
Conversion Cost:
  
This change is `upward compatible.' User code should run unmodified.
  
Aesthetics:
  
This proposal extends sequence functions to cover arrays while neatly
avoiding the temptation to turn Common Lisp into a half-baked APL.
Instead of trying to provide a full set of array handling primitives
(which would be needed to take arbitrary k-dimensional slices out of
n-dimensional arrays, or to apply an operator across a specific
dimension of a multidimensional array), it requires just one rule:

    treat arrays as displaced vectors.

Current Practice:

This is unlikely to be implemented anywhere since it has  only very
recently been proposed.
 
Discussion: 

This issue was discussed by the cleanup committee at length; this is
only a brief summary of the discussion.

An alternative considered was to only affect those functions which
didn't explicitly depend on the shape of the array; that is, to modify
COUNT, SOME, EVERY, NOTANY, NOTEVERY, FILL, REPLACE, SUBSTITUTE,
NSUBSTITUTE, and MAP, and expressly forbid arrays as arguments to other
sequence functions, including LENGTH, ELT, FIND, POSITION, REDUCE,
SEARCH,  MISMATCH, REVERSE, NREVERSE, SORT, MAP, as well as SUBSEQ,
COPY-SEQ, CONCATENATE, MERGE, REMOVE, REMOVE-DUPLICATES, DELETE,
DELETE-DUPLICATES. This would be less controversial, since it includes
only those functions which do not deal with positional information. Some
hedging over REDUCE and FIND, which often have non-positional uses, were
considered.

Touretzky supports SEQUENCE-FUNCTIONS-EXCLUDE-ARRAYS:GENERALIZE. He's
been building displaced vectors to pass to sequence functions when
necessary and really dislikes it.

We considered but discarded as unworkable an alternative where POSITION
and FIND might deal with "positions" as lists of array subscripts.

Another alternative considered would be to only adopt the COERCE change,
and require users who want to operate on arrays explictly perform, e.g.,
(COUNT item (COERCE x 'SEQUENCE)). This alternative would have the
lowest adoption cost; perhaps some implementations could optimize such
coercions.

The general reason for opposing this change is that it adds more
mechanism than it is worth. The general reason for liking it is that it
adds generality at little cost.