[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Issue: BIT-ARRAY-FUNCTIONS (version 5)



This is a new issue.  It arose from an investigation of features
that are plausibly needed but missing from draft ANSI Common Lisp.
This issue seems sufficiently simple and noncontroversial that
I would like to see it on the agenda for the June X3J13 meeting.
Let's use the cleanup subcommittee to test the assertion that this
is a simple and noncontroversial issue.  If it's controversial,
let's just drop it, otherwise let's give X3J13 a chance to vote
for or against it.

Issue:         BIT-ARRAY-FUNCTIONS
References:    The binary bit-array functions BIT-AND, BIT-IOR, BIT-XOR,
               BIT-EQV, BIT-NAND, BIT-NOR, BIT-ANDC1, BIT-ANDC2, BIT-ORC1,
               and BIT-ORC2 (CLtL p.294).
               The unary bit-array function BIT-NOT (CLtL p.295).
               The mapping functions EVERY, MAP, NOTANY, NOTEVERY, and SOME
               (CLtL pp.249-250).
               The functions COUNT and POSITION (CLtL p.257).
Related issues: none
Category:      ADDITION
Edit history:  Version 1,  9-May-89, by Moon
               Version 2, 10-May-89, by Moon (add second proposal)
               Version 3, 12-May-89, by Moon (small wording improvements)
               Version 4, 13-May-89, by Moon (make more understandable)
               Version 5, 23-May-89, by Moon (fix -p naming convention)

Problem description:

  Logical operations on bit vectors have been found to be useful in such
  programs as compiler flow analysis.  They are easy to implement in
  straight Common Lisp, but such an implementation is many times slower
  than an optimized implementation on most machines.  This is partly
  because many machines have instructions to perform these operations or
  inner kernels of them, and partly because Common Lisp is not a good
  language for implementing this type of low-level bit-oriented operation.

  Common Lisp provides some logical operations on bit arrays, but the
  provided set is incomplete.  Furthermore, the operations that are
  provided are only defined for arrays of identical dimensions, making them
  less useful for bit vectors that represent sets, where trailing zero
  elements are often omitted.  Some of the sequence functions are useful
  for bit vectors, but users (correctly) fear that their implementation may
  be optimized for general sequences, not for bit vectors.

  This issue contains two alternative proposals.

Proposal (BIT-ARRAY-FUNCTIONS:ADD):

  Allow the binary bit-array functions referenced above to accept arguments
  of identical rank but unequal dimensions.  Nonexistent elements of
  bit-array-1 or bit-array-2 are assumed to be zero.  If the third argument
  is T or a bit-array, result elements outside the bounds of the array must
  be zero or an error should be signalled.  If the third argument is NIL or
  omitted, each dimension of the result array is equal to either the
  corresponding dimension of bit-array-1 or the corresponding dimension of
  bit-array-2.  The larger of the two dimensions is used when necessary to
  hold all the nonzero elements of the result, otherwise either the larger
  or the smaller of the two dimensions is used.

  Allow BIT-NOT with a bit array as the second argument to accept arguments
  of identical rank but unequal dimensions.  Result elements outside the
  bounds of the array must be zero or an error should be signalled.

  Add the following functions:

    BIT-SUBSETP bit-array-1 bit-array-2

    Returns true if for every element of bit-array-1 that is 1, the
    element with the same subscripts exists in bit-array-2 and is 1.  
    Bit-array-1 and bit-array-2 must have identical rank but need not
    have identical dimensions.

    BIT-DISJOINTP bit-array-1 bit-array-2

    Returns true if for every element of one bit-array that is 1, the
    element with the same subscripts either does not exist in the other
    bit-array or is 0.  Bit-array-1 and bit-array-2 must have identical
    rank but need not have identical dimensions.

    BIT-EQUALP bit-array-1 bit-array-2

    Returns true if for every element of one bit-array that is 1, the
    element with the same subscripts exists in the other bit-array and
    is 1.  Bit-array-1 and bit-array-2 must have identical rank but need
    not have identical dimensions.

  Suggest in the language specification document that compilers should
  optimize the following functions when the sequence argument is declared
  to be a bit-vector, taking advantage of any relevant special machine
  instructions.

    COUNT
    POSITION

  Suggest in the language specification document that compilers should
  optimize the following functions when there are two arguments, the second
  argument is declared to be a bit-vector, and the predicate argument is
  #'ZEROP, taking advantage of any relevant special machine instructions.

    EVERY
    NOTANY
    NOTEVERY
    SOME

Proposal (BIT-ARRAY-FUNCTIONS:NO-NEW-FUNCTIONS):

  Same as BIT-ARRAY-FUNCTIONS:ADD but do not add the three new functions.
  Instead, generalize the mapping functions referenced above (EVERY, MAP,
  NOTANY, NOTEVERY, and SOME) so that they operate on "mappables" rather
  than just sequences.  Define a mappable to be an array or a list.
  Specify that the mappable arguments to a mapping function, and the result
  in the case of MAP with a non-NIL first argument, must all be of the same
  rank (the rank of a list is considered to be 1).  Mapping accesses array
  elements in row-major order.  Generalize the existing specification that
  a mapping function uses the length of the shortest sequence, to say that
  a mapping function uses on each axis the minimum of the dimensions on
  that axis of the mappable arguments.

  Suggest in the language specification document that compilers should
  optimize the functions EVERY, NOTANY, NOTEVERY, and SOME when there are
  two arguments, the second argument is declared to be a bit-array, and the
  predicate argument is #'ZEROP, taking advantage of any relevant special
  machine instructions.  In addition compilers should optimize when the
  second argument is a call with two arguments to one of the binary
  bit-array functions referenced above, to avoid consing an intermediate
  result.

Examples:

  The equivalents of UNION and INTERSECTION for sets represented
  as bit vectors, with 1's in positions where set elements are
  present, are BIT-IOR and BIT-AND respectively.

  (COUNT 1 (THE BIT-VECTOR BV)) computes the cardinality of a bit
  vector (called the population on some computers).  This is
  analogous to LOGCOUNT of an integer.

  (POSITION BIT (THE BIT-VECTOR BV)) scans for a 1 or 0 bit, but
  can likely be implemented using a word-at-a-time scan.

  (EVERY #'ZEROP (THE BIT-VECTOR BV)) tests whether a bit vector
  is entirely zero.

  (BIT-SUBSETP bit-array-1 bit-array-2) is equivalent to 
  (EVERY #'ZEROP (BIT-ANDC2 bit-array-1 bit-array-2)) [under
  the first proposal, only when the arguments are of rank 1.]

  (BIT-DISJOINTP bit-array-1 bit-array-2) is equivalent to
  (EVERY #'ZEROP (BIT-AND bit-array-1 bit-array-2)) [under
  the first proposal, only when the arguments are of rank 1.]
  This is analogous to NOT of LOGTEST of two integers.

  (BIT-EQUALP bit-array-1 bit-array-2) is equivalent to
  (EVERY #'ZEROP (BIT-XOR bit-array-1 bit-array-2)) [under
  the first proposal, only when the arguments are of rank 1.]
  BIT-EQUALP differs from EQUAL only when the arguments are of
  unequal dimensions.

Rationale:

  Three new functions are added by BIT-ARRAY-FUNCTIONS:ADD because EVERY
  only works on vectors, since issue SEQUENCE-FUNCTIONS-EXCLUDE-ARRAYS was
  rejected.  BIT-ARRAY-FUNCTIONS:NO-NEW-FUNCTIONS includes the minimal
  portion of that proposal needed to avoid adding any new functions, while
  omitting all the controversial parts.

  The suggestion for compiler optimization is to give users the confidence
  that they will get good results when using sequence and mapping
  operations on bit vectors.  Otherwise we would feel the need to add
  additional bit-vector-specific functions to perform these operations in a
  way that is optimized and specialized for bit-vectors.
  Recommending optimization of a particular way of performing these
  operations avoids the problem of each implementation choosing a different
  idiom to optimize, resulting in performance problems when porting.

  Relaxing the requirement that bit arrays must have equal dimensions was
  requested by users who had tried to use these operations on sets.  The
  loose specification of the result dimensions is to allow maximum
  implementation freedom.  This is not essential to the proposal and could
  be changed to require that the result have the same dimension as the
  larger of the two arguments.

Current practice:

  Symbolics Genera 7.2 has something like the first proposal, but only for
  bit vectors, not generalized for bit arrays.  Genera has some additional
  functions (BIT-VECTOR-POSITION, BIT-VECTOR-CARDINALITY, and
  BIT-VECTOR-ZERO-P) that aren't really necessary since they are equivalent
  to POSITION, COUNT, or EVERY plus a type declaration.  The proposal seems
  to fit into the rest of Common Lisp better than Genera's current practice.

Cost to Implementors:

  Implementing these very efficiently may require some clever hand coding.
  Of course the standard cannot mandate any particular level of efficiency
  and a simple, low-cost implementation is permissible.  Implementing the
  compiler suggestions requires keeping track of type declarations in the
  compiler, but most compilers already do that.  The second proposal
  requires slightly more compiler analysis than the first proposal.

  A run-time type test and dispatch to code specialized for bit-arrays
  could be used instead of compiler analysis, at a small efficiency cost.

Cost to Users:

  None.

Cost of non-adoption:

  Less featureful language.  Some bit array manipulation will have to be
  written in nonportable Lisp code or in C or assembly language.

Performance impact:

  None on programs that don't use these features.  Negligibly small on
  the binary bit-array functions referenced above when array dimensions
  are equal.  Large improvement for programs that can take advantage of
  these features when running in an implementation that optimizes them.

Benefits:

  More featureful language.

Esthetics:

  More featureful language.

Discussion:

  This functionality was suggested on the Common Lisp mailing list
  12-Jan-89.  The detailed design has evolved from what was suggested and
  is greatly simplified.