[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
SEQUENCE-FUNCTIONS-EXCLUDE-ARRAYS (Version 2)
- To: CL-Cleanup@SAIL.STANFORD.EDU
- Subject: SEQUENCE-FUNCTIONS-EXCLUDE-ARRAYS (Version 2)
- From: Kent M Pitman <KMP@STONY-BROOK.SCRC.Symbolics.COM>
- Date: Tue, 28 Apr 87 14:13 EDT
- Cc: KMP@STONY-BROOK.SCRC.Symbolics.COM
- References: <12276526020.28.TOURETZKY@C.CS.CMU.EDU>, <870315-164404-1975@Xerox>
Issue: SEQUENCE-FUNCTIONS-EXCLUDE-ARRAYS
References: Sequences (pp245-261), COERCE (p51)
Category: ENHANCEMENT
Edit history: 05-Feb-87, Version 1 by Touretzky
28-Apr-87, Version 2 by Pitman (variations on the theme)
Status: For Internal Discussion
Description:
Common Lisp provides many useful operations on lists and vectors which
don't apply to arrays.
For example, one can FILL a vector with 0's, but not an array. One can
REPLACE the contents of one vector with another, but one can't do this
for arrays. One can verify that EVERY element of a vector has some
property, but one can't do this for arrays. And so on.
The programmer who wishes to use arrays instead of vectors must give up
most of the useful tools CLtL provides for manipulating sequences, even
though there is no intuitive reason why operations like FILL, REPLACE,
and EVERY shouldn't work on arrays.
Notes about Voting:
Select one of:
SEQUENCE-FUNCTIONS-EXCLUDE-ARRAYS:GENERALIZE (by Touretzky)
SEQUENCE-FUNCTIONS-EXCLUDE-ARRAYS:MODIFIED (by Pitman)
SEQUENCE-FUNCTIONS-EXCLUDE-ARRAYS:UNCHANGED (status quo)
[See also notes in the discussion about the possibility of a fourth
way to go if you're not satisfied with any of these.]
Proposal (SEQUENCE-FUNCTIONS-EXCLUDE-ARRAYS:GENERALIZE):
Common Lisp already provides a facility called "displaced arrays"
which can be used to overlay one array on top of a portion of another,
even if the two are of different ranks, so that the two share storage.
Emphasize this as a way of explaining the behavior of sequence
functions on arbitrary arrays.
Modify the definition of COERCE to allow the coercion of an array to
a vector and vice versa. In keeping with p51 of CLtL, it should be an
error if the result type has a different number of elements than the
object being coerced.
Modify the definition of MAKE-SEQUENCE to accept array descriptions as
well as vector descriptions.
Extend the definitions of sequence functions that either return their
argument sequences or return non-sequences so that they also allow arrays.
These functions should treat array arguments as vectors displaced to the
array storage (in row-major format). The affected functions are LENGTH,
ELT, COUNT, FIND, POSITION, SOME, EVERY, NOTANY, NOTEVERY, REDUCE,
SEARCH, MISMATCH, FILL, REPLACE, NSUBSTITUTE, NREVERSE, SORT.
For example, suppose A is a 3x2x7 array. (LENGTH A) should return 42,
and (ELT A 7) should return A[0,1,0]. :START and :END keywords would
be interpreted relative to the vector, as would the results returned
by POSITION and SEARCH.
Extend the definitions of sequence functions whose result should be the
same shape as but not necessarily EQ to some argument. These functions
should deal with array arguments by returning an array of the same
shape. The affected functions are SUBSTITUTE, REVERSE, and MAP.
Expressly forbid arrays as arguments to sequence functions that modify
the number of elements in the array because the shape would be undefined.
These functions are SUBSEQ, COPY-SEQ, CONCATENATE, MERGE, REMOVE,
REMOVE-DUPLICATES, DELETE, DELETE-DUPLICATES.
Note that EQUALP does -not- treat arrays as vectors. It is not a
sequence function, and it already has a well-defined behavior for arrays.
To test whether the arrays A and B, of different shapes, have the same
elements, one would write:
(AND (= (LENGTH A) (LENGTH B)) (EVERY #'EQL A B)).
Rationale:
This proposal would expand rather than interfere with existing practice.
Since displaced arrays are already part of Common Lisp, the cost of the
proposed changes would be very low.
If the change is not adopted, Common Lisp programmers who wish to use
arrays will have two choices. Either they must write nested DO loops
every time they want to perform an array operation equivalent to FILL,
REPLACE, REDUCE, etc., or else they can build displaced vectors by
hand and pass them to the sequence functions when necessary.
Adoption Cost:
This would involve a lot of changes to functions, but all of them
presumably minor. The presence of displaced arrays in the language
already guarantees that the internal storage format needed to back
up these proposed changes is already in place.
Note that simply extending COERCE, MAKE-SEQUENCE, LENGTH, ELT, and
the SETF expander for ELT would have the side effect of extending
the remaining functions if they are written in the obvious way.
For example:
(DEFUN SUBSTITUTE (NEW-ITEM OLD-ITEM SEQUENCE &KEY ...)
(LET ((RESULT (MAKE-SEQUENCE (TYPE-OF SEQUENCE))))
(DOTIMES (I (LENGTH SEQUENCE) RESULT)
(SETF (ELT RESULT I)
(IF (EQL (ELT SEQUENCE I) OLD-ITEM)
NEW-ITEM
(ELT SEQUENCE I))))))
Benefits:
Users of arrays do not have to use home-grown utilities to duplicate
functionality already primitively provided to users of arrays. The
sequence functions become useful in a variety of new situations.
Conversion Cost:
This change is `upward compatible.' User code should run unmodified.
Aesthetics:
This proposal extends sequence functions to cover arrays while neatly
avoiding the temptation to turn Common Lisp into a half-baked APL.
Instead of trying to provide a full set of array handling primitives
(which would be needed to take arbitrary k-dimensional slices out of
n-dimensional arrays, or to apply an operator across a specific
dimension of a multidimensional array), it requires just one rule:
treat arrays as displaced vectors.
Proposal (SEQUENCE-FUNCTIONS-EXCLUDE-ARRAYS:MODIFIED):
Common Lisp already provides a facility called "displaced arrays"
which can be used to overlay one array on top of a portion of another,
even if the two are of different ranks, so that the two share storage.
Emphasize this as a way of explaining the behavior of sequence
functions on certain arrays.
Modify the definition of COERCE to allow the coercion of an array to
a vector and vice versa. In keeping with p51 of CLtL, it should be an
error if the result type has a different number of elements than the
object being coerced.
Extend the definitions of sequence functions that either return their
argument sequences, return sequences which are the same shape as their
argument, or return non-sequences so that they also allow arrays iff
their action is conceptually independent of the shape of the array.
The affected functions are COUNT, SOME, EVERY, NOTANY, NOTEVERY,
FILL, REPLACE, SUBSTITUTE, NSUBSTITUTE, and MAP.
Expressly forbid arrays as arguments to other sequence functions. These
unaffected functions are LENGTH, ELT, FIND, POSITION, REDUCE, SEARCH,
MISMATCH, REVERSE, NREVERSE, SORT, MAP, SUBSEQ, COPY-SEQ, CONCATENATE,
MERGE, REMOVE, REMOVE-DUPLICATES, DELETE, DELETE-DUPLICATES.
Rationale:
This proposal would expand rather than interfere with existing practice.
Since displaced arrays are already part of Common Lisp, the cost of the
proposed changes would be very low.
If the change is not adopted, Common Lisp programmers who wish to use
arrays will have two choices. Either they must write nested DO loops
every time they want to perform an array operation equivalent to FILL,
REPLACE, etc., or else they can build displaced vectors by hand and
pass them to the sequence functions when necessary.
This proposal extends certain sequence functions in some interesting
ways without committing us to a theory of how arrays and sequences
relate that everyone may not be happy with right now.
Adoption Cost:
This would involve a lot of changes to functions, but all of them
presumably minor. The presence of displaced arrays in the language
already guarantees that the internal storage format needed to back
up these proposed changes is already in place.
Benefits:
Users of arrays do not have to use home-grown utilities to duplicate
functionality already primitively provided to users of arrays. The
sequence functions become useful in a variety of new situations.
Conversion Cost:
This change is `upward compatible.' User code should run unmodified.
Aesthetics:
This extends certain existing sequence functions to allow arrays
as arguments in a fairly non-controversial way, leaving aside the
larger issue of whether and how to generalize the other sequence
functions.
Proposal (SEQUENCE-FUNCTIONS-EXCLUDE-ARRAYS:UNCHANGED):
This is the null proposal for the sake of voting clarity. Vote for this
if you think things should not change.
Current Practice:
Neither SEQUENCE-FUNCTIONS-EXCLUDE-ARRAYS:GENERALIZE nor
SEQUENCE-FUNCTIONS-EXCLUDE-ARRAYS:MODIFIED are likely to be implemented
anywhere since they are only very recently proposed.
Discussion:
Touretzky supports SEQUENCE-FUNCTIONS-EXCLUDE-ARRAYS:GENERALIZE.
He's been building displaced vectors to pass to sequence functions when
necessary and really dislikes it.
The members of the Cleanup committee expressed interest in the ideas
behind this proposal but weren't sure they could accept it in the
proposed form. A rewrite to separate some of the issues more clearly
was solicited.
Rees suggested that if people are not sure about this a proposal, it might
be possible to make fly a modified version of the proposal which extended
only those functions which did not deal with positional information.
Pitman wrote SEQUENCE-FUNCTIONS-EXCLUDE-ARRAYS:MODIFIED based on this idea
and supports at least that much extension, and is sympathetic to (but not
yet fully committed to the idea of the full proposal).
Note that in the SEQUENCE-FUNCTIONS-EXCLUDE-ARRAYS:MODIFIED proposal,
the function REDUCE is in a grey area. Many of its uses are not
position-dependent, but some are. The same argument might be made about
FIND. If people felt strongly, these too could be extended either by
fudging the conservative rule or by explicit special case(s).
[It's also possible that a still-more-general proposal might be
interesting. For example, one that introduced and inverse to
ARRAY-ROW-MAJOR-INDEX called ARROW-ROW-MAJOR-SUBSCRIPTS, and have
functions like POSITION that returned position information or things
that take :START and :END arguments use subscript (rather than offset)
information. eg,
(SETQ A (MAKE-ARRAY '(2 2) :INITIAL-ELEMENT 0))
(SETF (AREF A 1 0) '1)
(POSITION 1 A) => (1 0) ;Rather than 2 as suggested above
This might ease some people's minds if they're just worried that
returning a 1-d index will feel funny for an any-d array. On the
other hand, the linear ordering must still be well defined so it'll
be clear what the search order is, what range is being selected when
:START and/or :END is used, etc. so you can't hide the issue
completely. Still, if anyone's interested in a full-blown proposal
along these lines, they should ask me and I'll write it. --KMP]