[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Issue: ARRAY-TYPE-ELEMENT-TYPE-SEMANTICS (Version 2)



Issue:         ARRAY-TYPE-ELEMENT-TYPE-SEMANTICS

References:    Data types and Type specifiers: CLtL p. 11; Sect. 4.5, p.45
               TYPEP and SUBTYPEP; CLtL Sect. 6.2.1, p.72
               ARRAY-ELEMENT-TYPE, CLtL p. 291

Category:      CHANGE

Edit history:  Version 1, 13-May-88, JonL
	       Version 2, 23-May-88, JonL  
		(typo fixes, comments from moon, rearrange some discussion)

Problem description:

CLtL draws a distinction between type-specifiers "for declaration" and 
"for discrimination" in the case of array type specifiers -- i.e., those 
that are subtypes  of ARRAY [this implicitly includes SIMPLE-ARRAY and 
VECTOR].  Many people are confused by this situation, which may be one of 
the more insidious flaws in the current CL design.

To get a sense of how the community is confused, see the arpanet mailing 
list for Common Lisp, in a series of exchanges started on Thu, 
17 Dec 87 10:48:05 PST by Jeff Barnett <jbarnett@nrtc.northrop.com>
under the subject line of "Types in CL".  Also see the exchange started 
Wed, 6 Jan 88 23:21:16 PST by Jon L White <edsel!jonl@labrea.stanford.edu>
under the subject line of "TYPEP warp implications"

A consequence of this "flaw" is that a variable declared to be of type 
		<certain-type>
and all of whose assigned objects are created in accord with that type,
still may have *none* of its values ever satisfied by the TYPEP predicate 
with that type-specifier.  One type-specifier with this property in most 
implementations of CL is  <certain-type>  =  (ARRAY (SIGNED-BYTE 5)).
An array created with this type specifier will, depending on the vendor,
either be of typep (ARRAY (SIGNED-BYTE 8)) or (ARRAY T); but (almost) 
never of typep (ARRAY (SIGNED-BYTE 5)).

A common feeling among Common Lisp users is that the set of type-specifiers
forms a mathematical language, and the SUBTYPEP relation is generated by some 
basic, generating set of relations between these names.  People want to see
this language actually describing the implementation in which it is running 
-- that the objects of the implementation are a model for that language -- 
rather than seeing it be limited to some theoretical model which no vendor 
bothers to implement.   Hence, typep and subtypep should be changed to
reflect what kinds of arrays get built; alternatively, make-array should
be changed to exhibit the distinctions that typep makes.

No one seems to remember just exactly why this "warp" was put into Common
Lisp in the first place; possibly it was added in a mistaken belief that 
it would promote portability. ["Mistaken", because the source of non-
portability is the permission to upgrade; if two different implementations
do any upgrading differently, then the effect will be the same kind of 
non-portability that results when two different implementations set the 
boundary between fixnums and bignums differently.]  However, many now feel
it would be much simpler if the real source of the problem that led to 
this split up could be resolved.  Two proposals will be offered, each 
rejecting this odd distinction, but in different ways.  


[Notation: because of the inability to show meta- or linguistic variables
in a different typeface, some code examples will use a backquote format;
a form like ",x" will simply mean "substitute the value of x here".
and a form like `(...) will simply mean "evaluate this first, doing
the indicated comma substitutions."  Also, the term "type-equivalent"
will be used to denote two types which are each subtypes of the other;
this relation establishes a partition on the set of type-specifiers
into disjoint sets, and members of the same equivalence class are 
simply different names for the same type.]


One interesting krinkle in this issue is the question as to whether two 
non-EQ but type-equivalent type-specifiers ,aet-x and ,aet-y could give 
rise to different array types.  Remember that `(array ,aet-x) and 
`(array ,aet-y) are type-equivalent only if ,aet-x is type-equivalent 
to ,aet-y.  But this doesn't say anything about the reverse implication: 
does ,aet-x being type-equivalent to ,aet-y necessarily imply that 
`(array ,aet-x) is type-equivalent to `(array ,aet-y)?  For example, 
consider SHORT-FLOAT and SINGLE-FLOAT in an implementation where these 
are type-equivalent (because only one internal float type is provided for
Lisp objects -- see CLtL section 2.1.3).  One may desire to implement 
(ARRAY SHORT-FLOAT) and (ARRAY SINGLE-FLOAT) differently; say, for example 
that the former is packed into 16-bit half-words, whereas the latter is 
packed into 32-bit words.  But for either kind of packing, the result of 
AREF is an ordinary "single-float"; the whole point is merely to specify 
a packing technique for "packed float" arrays.  This "krinkle", however,
will not be addressed by the proposals herein; it should simply be
remembered that the implication goes only one way:

  `(array ,aet-x)  is-type-equivalent-to  `(array ,aet-y)
  ==> 
   ,aet-x  is-type-equivalent-to  ,aet-y



Test case: 

Let aet-x and aet-y be two distinct type specifiers that are definitely 
not type-equivalent in a given implementation, but for which make-array
will return an object of the same primitive type; let this primitive
type be described by `(array ,aet-y).  This will be an implementation 
dependent search, but in every implementation that the present writer 
(JonL) has tested, there will be some such types; often, (signed-byte 5)
and (signed-byte 8) will work.  Thus in each case, it should be true that:

  (array-element-type (make-array 0 :element-type ',aet-x))  ==> ,aet-y
  (array-element-type (make-array 0 :element-type ',aet-y))  ==> ,aet-y

and if such a function exists, then:

  (upgrade-array-element-type ',aet-x)  ==> ,aet-y
  (upgrade-array-element-type ',aet-y)  ==> ,aet-y


Now for a first set of tests; to eliminate the distinction between "for 
declaration" and "for discrimination" both of the following should be true:

  [A]
   `(typep (make-array 0 :element-type ',aet-x)
           '(array ,aet-x))
   `(typep (make-array 0 :element-type ',aet-y)
           '(array ,aet-y))

Since `(array ,aet-x) and `(array ,aet-y) are different names for exactly
the same set of objects -- those arrays specialized to hold elements of
type aet-y -- then if the correlation between type-names and sets of objects
is being observed, these names should be type-equivalent.  That means that
both of the following tests should be true:

  [B]
   `(subtypep '(array ,aet-x) '(array ,aet-y))
   `(subtypep '(array ,aet-y) '(array ,aet-x))

Additionally, to show that un-equivalent type-specifiers that are upgraded
to the same "element type" should be equivalent as element-type specifiers,
then both the following type tests should be true:

  [C]
   `(typep (make-array 0 :element-type ',aet-y)
           '(array ,aet-x))
   `(typep (make-array 0 :element-type ',aet-x)
           '(array ,aet-y))



Proposal (ARRAY-TYPE-ELEMENT-TYPE-SEMANTICS:UNIFY-UPGRADING)

-- Delete all the documentation that suggests the :element-type argument
   to make-array might not be a satisfactory element-type in the type
   specifier for the kind of array produced.  Particularly, delete the
   documentation that permits a difference between type-specifiers "for
   declaration" and "for discrimination;  primarily this means parts
   of the discussion in CLtL, section 4.5, p. 45 and 46.  Include a
   statement that `(array ,aet) is a suitable type-specifier for the
   array made by `(make-array <dims> :element-type ',aet ...).

-- Change the documentation of ARRAY-ELEMENT-TYPE, CLtL p. 291 by deleting
   the sentence begining "This set may be larger than the set requested
   when the array was created; for example . . ."

-- Add documentation assuring that the resultant array-element-type
   from a call to make-array is independent of any argument to make-array
   except for the :element-type argument.

-- Introduce a function UPGRADE-ARRAY-ELEMENT-TYPE, which will reveal how
   a particular :element-type argument to make-array will be treated.
   Semantically, this would be equivalent to:
     (defun upgrade-array-element-type (element-type)
       (array-element-type (make-array 0 :element-type element-type)))
   but it wouldn't have to cons up the 0-element array first.  Also,
   the presence of this name, as a documented function in the language, 
   would serve notice that "upgrading" is an important aspect of any real
   implementation of CL arrays.

-- Change TYPEP (and SUBTYPEP) so that `(typep x '(array ,aet)) is true
   if and only if `(typep x '(array ,(upgrade-array-element-type aet)))
   is true.


Rationale:

This proposal legitimizes current practice, and removes the obscure and 
un-useful distinction  between type-specifiers "for declaration" and "for
discrimination".  


Current Practice:

Every vendor's implementation that the present writer has queried does 
some amount of non-trivial "upgrading" [generally, using (signed-byte 5)
as an :element-type argument to make-array will show it]; this includes 
LUCID, VAXLISP, SYMBOLICS, FRANZ, and XEROX. Most implementations fail 
tests [A] and [C] part 1, but pass tests [A] and [C] part 2; this is a 
consequence of implementing the distinction between "for declaration" and 
"for discrimination".  Lucid and Xerox both pass test [B], and the other 
vendors fail it [it may be that these two vendors have implemented a 
partial solution for subtypep, with a view to future extensions.]


Cost to Implementors:

Typically, a small amount of work will be required in every vendor's
implementation of TYPEP and SUBTYPEP; some vendors may have already
done the work, but not fully integrated it.


Cost to Users:

Because of the prevalence of confusion in this area, it seems unlikely that
any user code will have to be changed.  In fact, it is more likely that some
of the vendors will cease to get bug reports about make-array returning a
result that isn't of "the obvious type".


Cost of non-adoption:

See Benefits.


Benefits:

It will greatly reduce confusion in the user community; the fact that 
(make-array <n> :element-type '<aet>) frequently is not of type (array <aet>)
has been very confusing to almost everyone.   That is, in practice,
the distinction between "for declaration" and "for discrimination" has
been a disaster.


Esthetics:


Discussion:

Many persons are in favor of the permission to upgrade; but they would not 
like to see CL become a language like C where there are a prescribed set of 
kinds of arrays that must be implemented (e.g, "int", "long int", "single",
"double" etc), and no others can exist.  In short, no one would want to gain 
portability at the expense of limiting the language to the architectural
features of the hardware on which it was first implemented.

Many senior implementors at Lucid favor this proposal.  In network mail 
"Date: Sat, 9 Jan 1988  15:54 EST" Rob McLaughlin favored the gist of this 
proposal -- namely that upgrading should be continued, and that TYPEP should 
be fixed.  Here is an excerpt of his words:
    There are two obvious solutions, one of which has already been proposed:
     -- Make the mapping more direct.  Eliminating "array element type
        upgrading" would be an instance of this.
     -- Make the mapping explicitly ill-defined in a more useful way.
    I think we need to do the latter because the former won't work.  I don't
    see how to eliminate "array element type upgrading" without creating
    more problems [than] we solve.  . . . 

    My conclusion is that it isn't array types that are wrong, it is the
    understanding of the meaning of TYPEP that is wrong. 


One possible criticism of this proposal is that it hinders portability by 
exposing an implementation dependent facet, namely the "upgrading" function.
But the kind of portabililty achieved by a programmer ignorant of the 
realities of array implementations -- that only a finite number of array 
type classes are really implemented efficiently in any particular CL 
implementation  -- would be a very pyrrhic victory to say the least.  The 
kinds of attention a programmer would have to pay to these array type 
boundaries is the same kind of attention he now has to pay to the 
fixnum/bignum boundary (i.e., arithmetic that is doable in unit time, and 
that which requires tens to hundreds of units of time).


This proposal arises from focusing on type specifiers as names for 
collections of objects, and on SUBTYPEP as being "subsetp" on collections 
of objects.

  -- The second paragraph of CLtL p11 makes it clear that "data types" are
     just sets of objects; subsequent discussion makes it clear that the
     second argument to TYPEP (a type specifier) is a name for some such
     set, and that there may be several distinct names specifying the same
     set (or type).  The first paragraph of section 6.2.1 on page 72 says
     that TYPEP is a set membership test, and SUBTYPEP is a subset test.

  -- The first two paragraphs of section 4.5, on page 45, describe a
     permission for what may be called "element-type upgrading" on arrays; 
     the  documentation for ARRAY-ELEMENT-TYPE on page 291 also makes it 
     clear that a conforming implementation is permitted to "collapse" 
     array element types into some more limited set of types, providing 
     that the array returned by make-array is at most an "upgrade".  The
     array types STRING and BIT-ARRAY are excluded from upgrading.

For example, depending on how primitive arrays are actually implemented, 

       (make-array <dims> :element-type '(signed-byte 5))

and

       (make-array <dims> :element-type '(signed-byte 8))

might legitimately create arrays specialized to hold exactly the same set of
objects.  The only constraint seems to be that that (array (signed-byte 8))
be the most specific type  *** in that implementation *** which can hold
all the arrays made by (make-array <dims> :element-type '(signed-byte 5) ...).
In this case, we say that the array element type has been upgraded from
(signed-byte 5) to (signed-byte 8), and we imply that there is no particular 
special provisions for arrays of element type, say, (signed-byte 6).

By the bulletted paragraphs above, (array (signed-byte 5)) and
(array (signed-byte 8)) are in fact names for exactly the same
set of objects.

However pages 45 and 46 go at length to specify that these two different
names for the same set of objects must be treated differently by TYPEP
and SUBTYPEP.  This seems to "warp" the utility of TYPEP since it puts
it at variance with the fundamental principle: "SUBTYPEP means subsetp".






Proposal (ARRAY-TYPE-ELEMENT-TYPE-SEMANTICS:FLUSH-UPGRADING)


-- Delete all the documentation that suggests the :element-type argument
   to make-array might not be a satisfactory element-type in the type
   specifier for the kind of array produced.  Particularly, delete the
   documentation that permits a difference between type-specifiers "for
   declaration" and "for discrimination;  primarily this means parts
   of the discussion in CLtL, section 4.5, p. 45 and 46.  Include a
   statement that `(array ,aet) is a suitable type-specifier for the
   array made by `(make-array <dims> :element-type ',aet ...).  Include
   also a statement that if `(array ,aet-x) and `(array ,aet-y)	 are
   type-equivalent, then it must be the case that ,aet-x and ,aet-y are
   also type-equivalent.

-- Change the documentation of ARRAY-ELEMENT-TYPE, CLtL p. 291 by deleting
   the sentence begining "This set may be larger than the set requested
   when the array was created; for example . . ."

-- Change the implementation of arrays so that a canonical member of
   the equivalence class around the :element-type argument is always
   remembered; ARRAY-ELEMENT-TYPE should return this canonical member.
   

Rationale:

This makes for maximum portability; type specifiers are "remembered"
exactly as the user typed them in (up to type-equivalence transformations).


Current Practice:

No vendor implements this.


Cost to Implementors:

Array implementations would have to be changed in every vendor; for some,
this could be an enormous amount of work.  In some implementations, the 
"cheapest" primitive arrays might become considerably more costly.  Since 
arrays seem to be at the base of every Common Lisp system, the effects of 
this change could be widespread, and costly to fix.


Cost to Users:

Because of the prevalence of confusion in this area, it seems unlikely that
any user code will have to be changed.  However, knowing what :element-type
specifiers are implemented efficiently will likely still be of concern to
most vendor's clients.


Benefits:

Being in love with portability means never having to say you're sorry
about TYPEP.


Esthetics:

Discussion:

The test case is, unfortunately, not relevant to this proposal, since it
presumes that some non-trivial upgrading is going on.  In particular,
there will be no "two, distinct type specifiers ,aet-x and ,aet-x that are 
definitely not type-equivalent, but for which 
  (array-element-type (make-array 0 :element-type ',aet-x))  ==> ,aet-y"

Many people are in favor of this proposal, at least initially.  See the
arpanet interchanges mentioned above.

One senior implementor at Lucid has favored this proposal.

It may be desirable to introduce a new function CANONICAL-TYPE of one
argument, which takes a type-specifier into a canonical type-specifier
for that type-equivalence class.