[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Issue: ARRAY-TYPE-ELEMENT-TYPE-SEMANTICS (Version 3)
- To: firstname.lastname@example.org
- Subject: Issue: ARRAY-TYPE-ELEMENT-TYPE-SEMANTICS (Version 3)
- From: Jon L White <email@example.com>
- Date: Thu, 2 Jun 88 22:06:32 PDT
Following version omits the "FLUSH-UPGRADING" proposal, which no one on this
committee offered to support. Additionally, it has a re-organization of the
discussion (as per moon's suggestion) now that only one proposal is present.
It doesn't attack the COMPLEX problem. I think biting off array element-type
semantics is enough for now; if it passes, then one of us will submit a
COMPLEX-ELEMENT-TYPE-SEMANTICS issue; if it fails, then COMPLEX will be moot.
References: Data types and Type specifiers: CLtL p. 11; Sect. 4.5, p.45
TYPEP and SUBTYPEP; CLtL Sect. 6.2.1, p.72
ARRAY-ELEMENT-TYPE, CLtL p. 291
Edit history: Version 1, 13-May-88, JonL
Version 2, 23-May-88, JonL
(typo fixes, comments from moon, rearrange some discussion)
Version 3, 02-Jun-88, JonL
(flush alternate proposal ["flush-upgrading"]; consequently,
move more of discussion back to discussion section)
CLtL draws a distinction between type-specifiers "for declaration" and
"for discrimination" in the case of array type specifiers -- i.e., those
that are subtypes of ARRAY [this implicitly includes SIMPLE-ARRAY and
VECTOR]. Many people are confused by this situation, which may be one of
the more insidious flaws in the current CL design.
A consequence of this "flaw" is that a variable declared to be of type
and all of whose assigned objects are created in accord with that type,
still may have *none* of its values ever satisfied by the TYPEP predicate
with that type-specifier. One type-specifier with this property in most
implementations of CL is <certain-type> = (ARRAY (SIGNED-BYTE 5)).
An array created with this type specifier will, depending on the vendor,
either be of typep (ARRAY (SIGNED-BYTE 8)) or (ARRAY T); but (almost)
never of typep (ARRAY (SIGNED-BYTE 5)).
[Notation: because of the inability to show meta- or linguistic variables
in a different typeface, some code examples will use a backquote format;
a form like ",x" will simply mean "substitute the value of x here".
and a form like `(...) will simply mean "evaluate this first, doing
the indicated comma substitutions." Also, the term "type-equivalent"
will be used to denote two types which are each subtypes of the other;
this relation establishes a partition on the set of type-specifiers
into disjoint sets, and members of the same equivalence class are
simply different names for the same type.]
Let aet-x and aet-y be two distinct type specifiers that are definitely
not type-equivalent in a given implementation, but for which make-array
will return an object of the same primitive type; let this primitive
type be described by `(array ,aet-y). This will be an implementation
dependent search, but in every implementation that the present writer
(JonL) has tested, there will be some such types; often, (signed-byte 5)
and (signed-byte 8) will work. Thus in each case, it should be true that:
(array-element-type (make-array 0 :element-type ',aet-x)) ==> ,aet-y
(array-element-type (make-array 0 :element-type ',aet-y)) ==> ,aet-y
and if such a function exists, then:
(upgrade-array-element-type ',aet-x) ==> ,aet-y
(upgrade-array-element-type ',aet-y) ==> ,aet-y
Now for a first set of tests; to eliminate the distinction between "for
declaration" and "for discrimination" both of the following should be true:
`(typep (make-array 0 :element-type ',aet-x)
`(typep (make-array 0 :element-type ',aet-y)
Since `(array ,aet-x) and `(array ,aet-y) are different names for exactly
the same set of objects -- those arrays specialized to hold elements of
type aet-y -- then if the correlation between type-names and sets of objects
is being observed, these names should be type-equivalent. That means that
both of the following tests should be true:
`(subtypep '(array ,aet-x) '(array ,aet-y))
`(subtypep '(array ,aet-y) '(array ,aet-x))
Additionally, to show that un-equivalent type-specifiers that are upgraded
to the same "element type" should be equivalent as element-type specifiers,
then both the following type tests should be true:
`(typep (make-array 0 :element-type ',aet-y)
`(typep (make-array 0 :element-type ',aet-x)
-- Delete all the documentation that suggests the :element-type argument
to make-array might not be a satisfactory element-type in the type
specifier for the kind of array produced. Particularly, delete the
documentation that permits a difference between type-specifiers "for
declaration" and "for discrimination; primarily this means parts
of the discussion in CLtL, section 4.5, p. 45 and 46. Include a
statement that `(array ,aet) is a suitable type-specifier for the
array made by `(make-array <dims> :element-type ',aet ...).
-- Change the documentation of ARRAY-ELEMENT-TYPE, CLtL p. 291 by deleting
the sentence begining "This set may be larger than the set requested
when the array was created; for example . . ."
-- Add documentation assuring that the resultant array-element-type
from a call to make-array is independent of any argument to make-array
except for the :element-type argument. Thus upgrading, if done at all,
must be done the same way for complex and multi-dimensional arrays as for
-- Introduce a function UPGRADE-ARRAY-ELEMENT-TYPE, which will reveal how
a particular :element-type argument to make-array will be treated.
Semantically, this would be equivalent to:
(defun upgrade-array-element-type (element-type)
(array-element-type (make-array 0 :element-type element-type)))
but it wouldn't have to cons up the 0-element array first. Also,
the presence of this name, as a documented function in the language,
would serve notice that "upgrading" is an important aspect of any real
implementation of CL arrays.
-- Change TYPEP (and SUBTYPEP) so that `(typep x '(array ,aet)) is true
if and only if `(typep x '(array ,(upgrade-array-element-type aet)))
This proposal legitimizes current practice, and removes the obscure and
un-useful distinction between type-specifiers "for declaration" and "for
Every vendor's implementation that the present writer has queried does
some amount of non-trivial "upgrading" [generally, using (signed-byte 5)
as an :element-type argument to make-array will show it]; this includes
LUCID, VAXLISP, SYMBOLICS, FRANZ, and XEROX. Most implementations fail
tests [A] and [C] part 1, but pass tests [A] and [C] part 2; this is a
consequence of implementing the distinction between "for declaration" and
"for discrimination". Lucid and Xerox both pass test [B], and the other
vendors fail it [it may be that these two vendors have implemented a
partial solution for subtypep, with a view to future extensions.]
Cost to Implementors:
Typically, a small amount of work will be required in every vendor's
implementation of TYPEP and SUBTYPEP; some vendors may have already
done the work, but not fully integrated it.
Cost to Users:
Because of the prevalence of confusion in this area, it seems unlikely that
any user code will have to be changed. In fact, it is more likely that some
of the vendors will cease to get bug reports about make-array returning a
result that isn't of "the obvious type".
Cost of non-adoption:
It will greatly reduce confusion in the user community; the fact that
(make-array <n> :element-type '<aet>) frequently is not of type (array <aet>)
has been very confusing to almost everyone. That is, in practice,
the distinction between "for declaration" and "for discrimination" has
been a disaster.
To get a sense of how the community is confused, see the arpanet mailing
list for Common Lisp, in a series of exchanges started on Thu,
17 Dec 87 10:48:05 PST by Jeff Barnett <firstname.lastname@example.org>
under the subject line of "Types in CL". Also see the exchange started
Wed, 6 Jan 88 23:21:16 PST by Jon L White <email@example.com>
under the subject line of "TYPEP warp implications"
A common feeling among Common Lisp users is that the set of type-specifiers
forms a mathematical language, and the SUBTYPEP relation is generated by some
basic, generating set of relations between these names. People want to see
this language actually describing the implementation in which it is running
-- that the objects of the implementation are a model for that language --
rather than seeing it be limited to some theoretical model which no vendor
bothers to implement. Hence, typep and subtypep should be changed to
reflect what kinds of arrays get built. [Alternatively, make-array should
be changed to exhibit the distinctions that typep makes.]
Many senior implementors at Lucid favor this proposal. In network mail
"Date: Sat, 9 Jan 1988 15:54 EST" Rob McLaughlin favored the gist of this
proposal -- namely that upgrading should be continued, and that TYPEP should
be fixed. Here is an excerpt of his words:
There are two obvious solutions, one of which has already been proposed:
-- Make the mapping more direct. Eliminating "array element type
upgrading" would be an instance of this.
-- Make the mapping explicitly ill-defined in a more useful way.
I think we need to do the latter because the former won't work. I don't
see how to eliminate "array element type upgrading" without creating
more problems [than] we solve. . . .
My conclusion is that it isn't array types that are wrong, it is the
understanding of the meaning of TYPEP that is wrong.
Many persons are in favor of the permission to upgrade; but they would not
like to see CL become a language like C where there are a prescribed set of
kinds of arrays that must be implemented (e.g, "int", "long int", "single",
"double" etc), and no others can exist. In short, no one would want to gain
portability at the expense of limiting the language to the architectural
features of the hardware on which it was first implemented.
One possible criticism of this proposal is that it hinders portability by
exposing an implementation dependent facet, namely the "upgrading" function.
But the kind of portabililty achieved by a programmer ignorant of the
realities of array implementations -- that only a finite number of array
type classes are really implemented efficiently in any particular CL
implementation -- would be a very pyrrhic victory to say the least. The
kinds of attention a programmer would have to pay to these array type
boundaries is the same kind of attention he now has to pay to the
fixnum/bignum boundary (i.e., arithmetic that is doable in unit time, and
that which requires tens to hundreds of units of time). No one seems to
remember just exactly why this "warp" was put into Common Lisp in the first
place -- drawing a distinction between array type-specifiers "for declaration"
and "for discrimination". Possibly it was added in a mistaken belief that it
would promote portability. ["Mistaken", because the source of non-portability
is the permission to upgrade; if two different implementations do any
upgrading differently, then the effect will be the same kind of
non-portability that results when two different implementations set the
boundary between fixnums and bignums differently.]
Since this proposal contains the implication:
`(array ,aet-x) is-type-equivalent-to `(array ,aet-y)
,aet-x is-type-equivalent-to ,aet-y
then the question naturally arises "Does the reverse implication hold?"
That is, should two non-EQ but type-equivalent type-specifiers ,aet-x
and ,aet-y always give rise to the same array types? For example,
consider SHORT-FLOAT and SINGLE-FLOAT in an implementation where these
are type-equivalent (because only one internal float type is provided for
Lisp objects -- see CLtL section 2.1.3). One may desire to implement
(ARRAY SHORT-FLOAT) and (ARRAY SINGLE-FLOAT) differently. Say, for example
that the former is packed into 16-bit half-words, whereas the latter is
packed into 32-bit words; but for either kind of packing, the result of
AREF is an ordinary "single-float". The whole point is merely to specify
a packing technique for "packed float" arrays. This "krinkle", however,
will not be addressed by the proposals herein; it should simply be
remembered that the implication above goes only one way, and is not
an "if-and-only-if" link.
This proposal arises from focusing on type specifiers as names for
collections of objects, and on SUBTYPEP as being "subsetp" on collections
-- The second paragraph of CLtL p11 makes it clear that "data types" are
just sets of objects; subsequent discussion makes it clear that the
second argument to TYPEP (a type specifier) is a name for some such
set, and that there may be several distinct names specifying the same
set (or type). The first paragraph of section 6.2.1 on page 72 says
that TYPEP is a set membership test, and SUBTYPEP is a subset test.
-- The first two paragraphs of section 4.5, on page 45, describe a
permission for what may be called "element-type upgrading" on arrays;
the documentation for ARRAY-ELEMENT-TYPE on page 291 also makes it
clear that a conforming implementation is permitted to "collapse"
array element types into some more limited set of types, providing
that the array returned by make-array is at most an "upgrade". The
array types STRING and BIT-ARRAY are excluded from upgrading.
For example, depending on how primitive arrays are actually implemented,
(make-array <dims> :element-type '(signed-byte 5))
(make-array <dims> :element-type '(signed-byte 8))
might legitimately create arrays specialized to hold exactly the same set of
objects. The only constraint seems to be that that (array (signed-byte 8))
be the most specific type *** in that implementation *** which can hold
all the arrays made by (make-array <dims> :element-type '(signed-byte 5) ...).
In this case, we say that the array element type has been upgraded from
(signed-byte 5) to (signed-byte 8), and we imply that there is no particular
special provisions for arrays of element type, say, (signed-byte 6).
By the bulletted paragraphs above, (array (signed-byte 5)) and
(array (signed-byte 8)) are in fact names for exactly the same
set of objects.
However pages 45 and 46 go at length to specify that these two different
names for the same set of objects must be treated differently by TYPEP
and SUBTYPEP. This seems to "warp" the utility of TYPEP since it puts
it at variance with the fundamental principle: "SUBTYPEP means subsetp".