[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

SEQUENCE-FUNCTIONS-EXCLUDE-ARRAYS (Version 4)



(Dave, in looking over the mail on this, I see that we neglected to cc
you on our discussion. I intended to, but apparently forgot, to include
the original submitter in subsequent discussion.)

I've tried to respond to Moon's points. I was reluctant to add (at a
late date) the ability to coerce a list to a non-vector array and vice
versa, because there are several possible natural interpretations.
(E.g., should (COERCE '#2A((A B) (C D)) 'LIST) => '(A B C D) or '((A B)
(C D))? Rather than opening that can of worms unnecessarily, I thought
it best to stick to the problem at hand...

I don't think we've justified extending COERCE to allow it to *increase*
the rank; all of the arguments were addressed toward merely allowing
(COERCE <array> 'VECTOR).  I changed point 1 to say this, hopefully less
ambiguously than before.

I agree with David's analysis of point 2 and removed it, (re)numbering
the subsequent points.

NREVERSE, NSUBSTITUTE were miscategorized. The  (N)SUBSTITUTE-IF(-NOT)
functions were missing.


!
Issue:        SEQUENCE-FUNCTIONS-EXCLUDE-ARRAYS
References:   Sequences (pp245-261), COERCE (p51)
              Issue REMF-DESTRUCTURING-UNSPECIFIED
                (discussion of NREVERSE, NSUBSTITUTE)
Category:     ENHANCEMENT
Edit history: 05-Feb-87, Version 1 by Touretzky
              28-Apr-87, Version 2 by Pitman (variations on the theme)
              26-Oct-87, Version 3 by Masinter (clean up for release)
              11-Nov-87, Version 4 by Masinter (respond to comments)

Description:

Common Lisp provides many useful operations on lists and vectors which
don't apply to arrays.

For example, one can FILL a vector with 0's, but not an array. One can
REPLACE the contents of one vector with another, but one can't do this
for arrays.  One can verify that EVERY element of a vector has some
property, but one can't do this for arrays, and so on.

The programmer who wishes to use arrays instead of vectors must give up
most of the useful tools CLtL provides for manipulating sequences, even
though there is no intuitive reason why operations like FILL, REPLACE,
and EVERY shouldn't work on arrays.

Common Lisp already provides a facility called "displaced arrays" which
can be used to overlay one array on top of a portion of another, even if
the two are of different ranks, so that the two share storage or use the
awkward convention of creating a displaced array to the operand.
However, creating a displaced array merely to perform FILL, REPLACE or
EVERY is awkward.

Proposal (SEQUENCE-FUNCTIONS-EXCLUDE-ARRAYS:GENERALIZE):

1) Generalize the definition of COERCE so that an array of any rank can
be coerced to type VECTOR or to type SEQUENCE. The result of (COERCE
<array> 'VECTOR) or (COERCE <array> 'SEQUENCE) when <array> is an array
of rank other than 1 is a vector displaced to the original array,
equivalent to  

(MAKE-ARRAY (ARRAY-TOTAL-SIZE <array>)
   :ELEMENT-TYPE (ARRAY-ELEMENT-TYPE <array>)
   :DISPLACED-TO <array>).
 

2) Extend the definitions of sequence functions that either return their
argument sequences or return non-sequences so that they also allow
arrays of rank other than 1. These functions should treat array
arguments as vectors displaced to the array storage (in row-major
format). The affected functions are LENGTH, ELT, COUNT, FIND, POSITION,
SOME, EVERY, NOTANY, NOTEVERY, REDUCE, SEARCH, MISMATCH, FILL, REPLACE,
SORT.

For example, suppose A is a 3x2x7 array. (LENGTH A) should return 42,
and (ELT A 7) should return the same element as (AREF A 0 1 0).  :START
and :END keywords would be interpreted relative to the vector, as would
the results returned by POSITION and SEARCH.

3) Extend the definitions of sequence functions whose result should be
the same shape as but not necessarily EQ to some argument. These
functions should deal with array arguments by returning an array of the
same shape as the argument, and operate on their argument in row-major
order. The affected functions are SUBSTITUTE, NSUBSTITUTE, REVERSE,
NREVERSE, SUBSTITUTE-IF, NSUBSTITUTE-IF, SUBSTITUTE-IF-NOT,
NSUBSTITUTE-IF-NOT and MAP.

4) Sequence functions that modify the number of elements in the array
remain unchanged: it is an error to pass arrays of rank other than 1.
(The functions are not extended because the shape would be undefined.)
The unaffected functions are SUBSEQ, COPY-SEQ, CONCATENATE, MERGE,
REMOVE, REMOVE-DUPLICATES, DELETE, DELETE-DUPLICATES.

Note that EQUALP does -not- treat arrays as vectors.  It is not a
sequence function, and it already has a well-defined behavior for
arrays. To test whether the arrays A and B, of different shapes, have
the same elements, one might write:

	(AND (= (LENGTH A) (LENGTH B)) (EVERY #'EQL A B)).

Rationale:
  
This is a useful upward compatible extension with relatively low
adoption cost.
  
Adoption Cost:
  
This would involve a lot of changes to functions, but all of the changes
are likely to be minor. The presence of displaced arrays in the language
already guarantees that the internal storage format needed to back up
these proposed changes is already in place. 

Benefits:
  
This proposal adds natural support for multi-dimensional arrays.
Currently, users must write nested DO loops every time they want to
perform an array operation equivalent to FILL, REPLACE, REDUCE, etc., or
else they build displaced vectors by hand and pass them to the sequence
functions when necessary. With this proposal,users of arrays do not have
to use home-grown utilities to duplicate functionality already
primitively provided to users of arrays. The sequence functions become
useful in a variety of new situations.
  
Conversion Cost:
  
This change is upward compatible; current user code should run
unmodified.
  
Aesthetics:
  
This proposal extends sequence functions to cover arrays while neatly
avoiding the temptation to turn Common Lisp into a half-baked APL.
Instead of trying to provide a full set of array handling primitives
(which would be needed to take arbitrary k-dimensional slices out of
n-dimensional arrays, or to apply an operator across a specific
dimension of a multidimensional array), it requires just one rule:

    treat arrays as displaced vectors where it is well-defined.

Current Practice:

We know of no current implementation of this proposal.
 
Discussion: 

This issue was discussed by the cleanup committee at length; what
follows is only a brief summary of the discussion.

An alternative considered was to only affect those functions which
didn't explicitly depend on the shape of the array; that is, to modify
COUNT, SOME, EVERY, NOTANY, NOTEVERY, FILL, REPLACE, SUBSTITUTE,
NSUBSTITUTE, and MAP, and expressly forbid arrays as arguments to other
sequence functions, including LENGTH, ELT, FIND, POSITION, REDUCE,
SEARCH,  MISMATCH, REVERSE, NREVERSE, SORT, MAP, as well as SUBSEQ,
COPY-SEQ, CONCATENATE, MERGE, REMOVE, REMOVE-DUPLICATES, DELETE,
DELETE-DUPLICATES. This would be less controversial, since it includes
only those functions which do not deal with positional information. Some
hedging over REDUCE and FIND, which often have non-positional uses, were
considered.

Touretzky supports SEQUENCE-FUNCTIONS-EXCLUDE-ARRAYS:GENERALIZE. He's
been building displaced vectors to pass to sequence functions when
necessary and really dislikes it.

We considered but discarded as unworkable an alternative where POSITION
and FIND might deal with "positions" as lists of array subscripts.

Another alternative considered would be to only adopt the COERCE change,
and require users who want to operate on arrays explictly perform, e.g.,
(COUNT item (COERCE x 'SEQUENCE)). This alternative would have the
lowest adoption cost; perhaps some implementations could optimize such
coercions.

The general reason for opposing this change is that it adds more
mechanism than it is worth. The general reason for liking it is that it
adds generality at little cost.