[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Issue: BIT-ARRAY-FUNCTIONS (version 5)
- To: CL-Cleanup@sail.stanford.edu
- Subject: Issue: BIT-ARRAY-FUNCTIONS (version 5)
- From: David A. Moon <Moon@STONY-BROOK.SCRC.Symbolics.COM>
- Date: Tue, 23 May 89 14:42 EDT
This is a new issue. It arose from an investigation of features
that are plausibly needed but missing from draft ANSI Common Lisp.
This issue seems sufficiently simple and noncontroversial that
I would like to see it on the agenda for the June X3J13 meeting.
Let's use the cleanup subcommittee to test the assertion that this
is a simple and noncontroversial issue. If it's controversial,
let's just drop it, otherwise let's give X3J13 a chance to vote
for or against it.
Issue: BIT-ARRAY-FUNCTIONS
References: The binary bit-array functions BIT-AND, BIT-IOR, BIT-XOR,
BIT-EQV, BIT-NAND, BIT-NOR, BIT-ANDC1, BIT-ANDC2, BIT-ORC1,
and BIT-ORC2 (CLtL p.294).
The unary bit-array function BIT-NOT (CLtL p.295).
The mapping functions EVERY, MAP, NOTANY, NOTEVERY, and SOME
(CLtL pp.249-250).
The functions COUNT and POSITION (CLtL p.257).
Related issues: none
Category: ADDITION
Edit history: Version 1, 9-May-89, by Moon
Version 2, 10-May-89, by Moon (add second proposal)
Version 3, 12-May-89, by Moon (small wording improvements)
Version 4, 13-May-89, by Moon (make more understandable)
Version 5, 23-May-89, by Moon (fix -p naming convention)
Problem description:
Logical operations on bit vectors have been found to be useful in such
programs as compiler flow analysis. They are easy to implement in
straight Common Lisp, but such an implementation is many times slower
than an optimized implementation on most machines. This is partly
because many machines have instructions to perform these operations or
inner kernels of them, and partly because Common Lisp is not a good
language for implementing this type of low-level bit-oriented operation.
Common Lisp provides some logical operations on bit arrays, but the
provided set is incomplete. Furthermore, the operations that are
provided are only defined for arrays of identical dimensions, making them
less useful for bit vectors that represent sets, where trailing zero
elements are often omitted. Some of the sequence functions are useful
for bit vectors, but users (correctly) fear that their implementation may
be optimized for general sequences, not for bit vectors.
This issue contains two alternative proposals.
Proposal (BIT-ARRAY-FUNCTIONS:ADD):
Allow the binary bit-array functions referenced above to accept arguments
of identical rank but unequal dimensions. Nonexistent elements of
bit-array-1 or bit-array-2 are assumed to be zero. If the third argument
is T or a bit-array, result elements outside the bounds of the array must
be zero or an error should be signalled. If the third argument is NIL or
omitted, each dimension of the result array is equal to either the
corresponding dimension of bit-array-1 or the corresponding dimension of
bit-array-2. The larger of the two dimensions is used when necessary to
hold all the nonzero elements of the result, otherwise either the larger
or the smaller of the two dimensions is used.
Allow BIT-NOT with a bit array as the second argument to accept arguments
of identical rank but unequal dimensions. Result elements outside the
bounds of the array must be zero or an error should be signalled.
Add the following functions:
BIT-SUBSETP bit-array-1 bit-array-2
Returns true if for every element of bit-array-1 that is 1, the
element with the same subscripts exists in bit-array-2 and is 1.
Bit-array-1 and bit-array-2 must have identical rank but need not
have identical dimensions.
BIT-DISJOINTP bit-array-1 bit-array-2
Returns true if for every element of one bit-array that is 1, the
element with the same subscripts either does not exist in the other
bit-array or is 0. Bit-array-1 and bit-array-2 must have identical
rank but need not have identical dimensions.
BIT-EQUALP bit-array-1 bit-array-2
Returns true if for every element of one bit-array that is 1, the
element with the same subscripts exists in the other bit-array and
is 1. Bit-array-1 and bit-array-2 must have identical rank but need
not have identical dimensions.
Suggest in the language specification document that compilers should
optimize the following functions when the sequence argument is declared
to be a bit-vector, taking advantage of any relevant special machine
instructions.
COUNT
POSITION
Suggest in the language specification document that compilers should
optimize the following functions when there are two arguments, the second
argument is declared to be a bit-vector, and the predicate argument is
#'ZEROP, taking advantage of any relevant special machine instructions.
EVERY
NOTANY
NOTEVERY
SOME
Proposal (BIT-ARRAY-FUNCTIONS:NO-NEW-FUNCTIONS):
Same as BIT-ARRAY-FUNCTIONS:ADD but do not add the three new functions.
Instead, generalize the mapping functions referenced above (EVERY, MAP,
NOTANY, NOTEVERY, and SOME) so that they operate on "mappables" rather
than just sequences. Define a mappable to be an array or a list.
Specify that the mappable arguments to a mapping function, and the result
in the case of MAP with a non-NIL first argument, must all be of the same
rank (the rank of a list is considered to be 1). Mapping accesses array
elements in row-major order. Generalize the existing specification that
a mapping function uses the length of the shortest sequence, to say that
a mapping function uses on each axis the minimum of the dimensions on
that axis of the mappable arguments.
Suggest in the language specification document that compilers should
optimize the functions EVERY, NOTANY, NOTEVERY, and SOME when there are
two arguments, the second argument is declared to be a bit-array, and the
predicate argument is #'ZEROP, taking advantage of any relevant special
machine instructions. In addition compilers should optimize when the
second argument is a call with two arguments to one of the binary
bit-array functions referenced above, to avoid consing an intermediate
result.
Examples:
The equivalents of UNION and INTERSECTION for sets represented
as bit vectors, with 1's in positions where set elements are
present, are BIT-IOR and BIT-AND respectively.
(COUNT 1 (THE BIT-VECTOR BV)) computes the cardinality of a bit
vector (called the population on some computers). This is
analogous to LOGCOUNT of an integer.
(POSITION BIT (THE BIT-VECTOR BV)) scans for a 1 or 0 bit, but
can likely be implemented using a word-at-a-time scan.
(EVERY #'ZEROP (THE BIT-VECTOR BV)) tests whether a bit vector
is entirely zero.
(BIT-SUBSETP bit-array-1 bit-array-2) is equivalent to
(EVERY #'ZEROP (BIT-ANDC2 bit-array-1 bit-array-2)) [under
the first proposal, only when the arguments are of rank 1.]
(BIT-DISJOINTP bit-array-1 bit-array-2) is equivalent to
(EVERY #'ZEROP (BIT-AND bit-array-1 bit-array-2)) [under
the first proposal, only when the arguments are of rank 1.]
This is analogous to NOT of LOGTEST of two integers.
(BIT-EQUALP bit-array-1 bit-array-2) is equivalent to
(EVERY #'ZEROP (BIT-XOR bit-array-1 bit-array-2)) [under
the first proposal, only when the arguments are of rank 1.]
BIT-EQUALP differs from EQUAL only when the arguments are of
unequal dimensions.
Rationale:
Three new functions are added by BIT-ARRAY-FUNCTIONS:ADD because EVERY
only works on vectors, since issue SEQUENCE-FUNCTIONS-EXCLUDE-ARRAYS was
rejected. BIT-ARRAY-FUNCTIONS:NO-NEW-FUNCTIONS includes the minimal
portion of that proposal needed to avoid adding any new functions, while
omitting all the controversial parts.
The suggestion for compiler optimization is to give users the confidence
that they will get good results when using sequence and mapping
operations on bit vectors. Otherwise we would feel the need to add
additional bit-vector-specific functions to perform these operations in a
way that is optimized and specialized for bit-vectors.
Recommending optimization of a particular way of performing these
operations avoids the problem of each implementation choosing a different
idiom to optimize, resulting in performance problems when porting.
Relaxing the requirement that bit arrays must have equal dimensions was
requested by users who had tried to use these operations on sets. The
loose specification of the result dimensions is to allow maximum
implementation freedom. This is not essential to the proposal and could
be changed to require that the result have the same dimension as the
larger of the two arguments.
Current practice:
Symbolics Genera 7.2 has something like the first proposal, but only for
bit vectors, not generalized for bit arrays. Genera has some additional
functions (BIT-VECTOR-POSITION, BIT-VECTOR-CARDINALITY, and
BIT-VECTOR-ZERO-P) that aren't really necessary since they are equivalent
to POSITION, COUNT, or EVERY plus a type declaration. The proposal seems
to fit into the rest of Common Lisp better than Genera's current practice.
Cost to Implementors:
Implementing these very efficiently may require some clever hand coding.
Of course the standard cannot mandate any particular level of efficiency
and a simple, low-cost implementation is permissible. Implementing the
compiler suggestions requires keeping track of type declarations in the
compiler, but most compilers already do that. The second proposal
requires slightly more compiler analysis than the first proposal.
A run-time type test and dispatch to code specialized for bit-arrays
could be used instead of compiler analysis, at a small efficiency cost.
Cost to Users:
None.
Cost of non-adoption:
Less featureful language. Some bit array manipulation will have to be
written in nonportable Lisp code or in C or assembly language.
Performance impact:
None on programs that don't use these features. Negligibly small on
the binary bit-array functions referenced above when array dimensions
are equal. Large improvement for programs that can take advantage of
these features when running in an implementation that optimizes them.
Benefits:
More featureful language.
Esthetics:
More featureful language.
Discussion:
This functionality was suggested on the Common Lisp mailing list
12-Jan-89. The detailed design has evolved from what was suggested and
is greatly simplified.