[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Bit vectors vs. fixnums



    Date: Fri, 5 Oct 90 00:03:24 EDT
    From: barmar@Think.COM

       Date: Thu, 4 Oct 90 16:20 PDT
       From: charest@AI-SUN.JPL.NASA.GOV (Charest)

       I have an application in which I want to implement a set of boolean
       properties as a "array" of bits. I am doing this to minimize the 
       memory consumed. 

       I am stuck between using bit vectors or fixnums in my implementation.
       Bit vectors seem "conceptually" appropriate, but do they eat more 
       memory than a fixnum of the same size as allocated by SCL? And are the 
       bit vector bit-manipulation functions more or less efficient than 
       the fixnum bit-manipulation functions? On the other hand, is one 
       implementation more flexible than the other?

    Fixnums take up no space, because they are implemented as immediate data
    (they take the place of the address field in the pointer).  However,
    fixnums are only useful if your array has at most 31 elements.  A bit array
    with 1 to 32 elements takes up two words -- one for the array header and
    another for the data.

    Bignums take about the same space as bit arrays with the same number of
    bits, except that bignums automatically discard high-order all-zero words.

    My guess is that the performance of the two mechanisms will be similar.  At
					    ^^^^^^^^^^^^^^
    the microcode level the objects look very similar, and it will end up doing
    the same shifts and masks to test and set bits.

By `two',  you  mean  bignum  vs.  bitvectors?  Using  fixnums should be
somewhat  faster  since  it   doesn't  have  to   go  through  all   the
header/array-dimension decoding.  Since accessors (for the fixnum/bignum
case) would likely be  substs or macros  doing things like
	(LDB (BYTE 1  PROPERTY-NUMBER) BOOLVECTOR),
I would guess that the processor  would begin on the assumption  the arg
is a fixnum  and dispatch  only if  it turned  out to  be a bignum.  The
bitvector approach would presumably  have to go  through the whole  aref
business (bit is really aref).

    One difference is in consing, though: you can perform side effects on bit
    arrays, but integers are not modifiable so you must always compute a new
    one.  For fixnums this isn't a problem, but for bignums it can add up
    pretty quickly.

VERY quickly!

    The modifiability can also affect the program design.  If multiple
    variables refer to the same bit array you can modify it and they'll all see
    the change.  If you're using integers you must always store the modified
    integer back into all the variables that expect to see the change.  You can
    get around this by encapsulating the integer in a structure, but this
    results in extra storage and slower accesses.

This is a very good point.

On the other  hand, if  the properties  only need  to be  stored in  one
place, there may be some benefits from using fixnums (or bignums)  since
they can also be treated as  numbers.     Although there are a  bunch of
special operations on bitvectors, (BIT-AND, etc.  ) there are some other
operations on integers  that may  allow you  to get  certain information
more quickly, (INTEGER-LENGTH, LOGCOUNT,  ZL:HAIPART...).     Even >,  <
comparisons could be taken to have  a meaning? And perhaps other  tricky
extractions (comp.lang.c on Usenet is a good source of such  obscurity).
It depends on  what kind  of info  you need,  how fast  they have  to be
(=how often  you  ask),  and  to  what  extent  Symbolics  has  made the
bitvector analogs  do  exactly  the  same  thing  internally  (I haven't
benchmarked).

Letting the numbers grow to bignums may be the easiest to dynamically
add properties (where they default to 0).  But the price is high.

IE. No single `right' answer.
bruce