
Issue: SEQUENCE-TYPE-LENGTH (version 1) Coercing sequences to vectors of different sizes



I hate to bring up a new issue so late, but maybe this one can be addressed
very quickly, or maybe somebody on the CL-Cleanup list will tell me that it
has already been addressed and I just overlooked it, which would be ideal
as far as I am concerned.


Issue:         SEQUENCE-TYPE-LENGTH

References:    CLtL p.51, p.249, p.260, p.252, p.354
               CONCATENATE, COERCE, MAKE-SEQUENCE, MAP, MERGE

Category:      CLARIFICATION

Edit history:  16-Jun-89, version 1, by Moon

Problem description:

  In several functions that take a type specifier as an argument and create
  a sequence of the specified type, it isn't clear what happens if the type
  specifier has an explicit length that doesn't match the length implied by
  the other arguments.

Proposal (SEQUENCE-TYPE-LENGTH:MUST-MATCH):

  COERCE should signal an error if the new sequence type specifies the
  number of elements and the old sequence has a different length.

  MAKE-SEQUENCE should signal an error if the sequence type specifies the
  number of elements and the size argument is different.

  CONCATENATE should signal an error if the sequence type specifies the
  number of elements and the sum of the argument lengths is different.

  MAP should signal an error if the sequence type specifies the number of
  elements and the minimum of the argument lengths is different.

  MERGE should signal an error if the sequence type specifies the number of
  elements and the sum of the lengths of the two sequence arguments is
  different.
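
  As an illustration only, the MAKE-SEQUENCE clause could be approximated
  in portable user code roughly as follows.  The names SPECIFIED-LENGTH and
  CHECKED-MAKE-SEQUENCE are hypothetical and not part of this proposal, and
  the sketch recognizes only a few common compound type specifiers; an
  implementation would do the check inside MAKE-SEQUENCE itself.

```lisp
;; Hypothetical helper: return the explicit length named by a sequence
;; type specifier, or NIL if the specifier does not fix a length.
;; Only a handful of compound specifiers are recognized in this sketch.
(defun specified-length (type)
  (when (consp type)
    (let ((size (case (first type)
                  ;; (VECTOR element-type size)
                  (vector (third type))
                  ;; (STRING size), (SIMPLE-VECTOR size), and so on
                  ((simple-vector string simple-string
                    bit-vector simple-bit-vector)
                   (second type)))))
      ;; A missing size or * means "no particular length".
      (if (integerp size) size nil))))

;; Hypothetical wrapper: like MAKE-SEQUENCE, but signals an error when
;; the type specifier names a length that differs from SIZE.
(defun checked-make-sequence (type size)
  (let ((expected (specified-length type)))
    (when (and expected (/= expected size))
      (error "Type ~S specifies length ~D, but the size argument is ~D."
             type expected size))
    (make-sequence type size)))
```

  With these definitions, (checked-make-sequence '(vector * 3) 3) returns a
  vector of length 3, while (checked-make-sequence '(vector * 2) 3) signals
  an error.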

Examples:

  ;; All of the following forms should signal an error
  (coerce '(a b c) '(vector * 4))
  (coerce #(a b c) '(vector * 4))
  (coerce '(a b c) '(vector * 2))
  (coerce #(a b c) '(vector * 2))
  (coerce "foo" '(string 2))
  (coerce #(#\a #\b #\c) '(string 2))
  (coerce '(0 1) '(simple-bit-vector 3))
  (make-sequence '(vector * 2) 3)
  (make-sequence '(vector * 4) 3)
  (concatenate '(vector * 2) "a" "bc")
  (map '(vector * 4) #'cons "abc" "de")
  (merge '(vector * 4) '(1 5) '(2 4 6) #'<)

Rationale:

  If CLtL hadn't overlooked this situation, it's likely that it would have
  said it "is an error".  The best translation of that to ANSI CL error
  terminology seemed to be "should signal".  There doesn't seem to be any
  reason to require signalling this error even in unsafe code.  There
  doesn't seem to be any reason to define this situation to do something
  other than signalling an error, such as ignoring the length in the
  type specifier or forcing the sequence to have the correct length by
  truncating or extending it with elements of implementation-dependent
  value.

Current practice:

  Symbolics Genera 7.2 and 7.4 usually ignore the length in the type
  specifier in the above situations, but sometimes signal an error.
  The type of error signalled is not consistent from case to case.
  Other implementations were not surveyed.

Cost to Implementors:

  This does not seem like difficult checking to add.  I have not examined
  the code in any implementation to try to evaluate what it would cost.

Cost to Users:

  None.

Cost of non-adoption:

  Aesthetic.

Performance impact:

  Probably small.  An implementation just has to keep track of the length
  when dealing with sequence type specifiers in safe code.  I have not
  attempted to evaluate the exact impact.

Benefits:

  Less ambiguity in the language specification.  Less deviation among
  implementations, hence fewer porting problems.

Esthetics:

  Since the length field is present in sequence type specifiers, it
  seems unesthetic to ignore it, and even more unesthetic not to say
  what is done with it.

Discussion:

  Moon doesn't know what error condition is appropriate.  TYPE-ERROR
  doesn't seem quite appropriate here.  One idea is not to say, just let it
  be any subtype of ERROR.  Another idea is to produce the result object
  and then signal a TYPE-ERROR that this object doesn't match the
  type-specifier for the result type.
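
  The second idea might look roughly like the sketch below.  The name
  CHECK-RESULT-TYPE is hypothetical, and the sketch assumes the condition
  system's TYPE-ERROR accepts the :DATUM and :EXPECTED-TYPE initialization
  arguments.  The caller is assumed to have already built the result
  without regard to the length in the type specifier.

```lisp
;; Hypothetical helper: RESULT is the sequence that was actually built,
;; RESULT-TYPE is the type specifier the caller asked for.  If the result
;; does not satisfy the specifier (for example because the lengths
;; differ), signal a TYPE-ERROR that carries both.
(defun check-result-type (result result-type)
  (if (typep result result-type)
      result
      (error 'type-error
             :datum result
             :expected-type result-type)))
```

  For example, (check-result-type (coerce '(a b c) 'vector) '(vector * 4))
  would signal a TYPE-ERROR whose datum is the three-element vector, while
  the same call with '(vector * 3) simply returns the vector.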

  Cassels points out that two similar operations are defined in CLtL to be
  inconsistent with each other:

  (replace (make-array 4) #(1 2 3)) just picks the shorter length, and
     "the extra elements near the end of the longer subsequence are not
     involved in the operation" so the result is #(1 2 3 NIL), assuming
     MAKE-ARRAY filled the new array with NIL.

  #4(1 2 3) duplicates the last element, so it's like #(1 2 3 3)
  #2(1 2 3) "is an error".
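
  The two behaviors Cassels contrasts can be tried out with a sketch like
  the following.  The function names are hypothetical, and :INITIAL-ELEMENT
  NIL is supplied explicitly because the initial contents of a fresh array
  are otherwise not defined.  The #2(1 2 3) case is left out since reading
  it "is an error".

```lisp
;; REPLACE copies only as many elements as the shorter sequence has;
;; the fourth element of the target is left as it was (NIL here).
(defun replace-into-longer ()
  (let ((target (make-array 4 :initial-element nil)))
    (replace target #(1 2 3))))

;; The #n( reader syntax pads a short element list by repeating the
;; last element, so this reads as a four-element vector.
(defun read-padded-vector ()
  (values (read-from-string "#4(1 2 3)")))
```

  Here (replace-into-longer) yields a vector with the elements 1, 2, 3 and
  NIL, while (read-padded-vector) yields one with the elements 1, 2, 3
  and 3.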