[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

fast arefs

   Date: Fri, 8 Jan 88 00:09 EST
   From: Barry Margolin <barmar@Think.COM>
   Resent-Date: Thu 7 Jan 88 23:37:32-CST
   Resent-From: CMP.SLUG@r20.utexas.edu
   Resent-To: SLUG:;@r20.utexas.edu

       Date: Thu, 7 Jan 88 18:40 EST
       From: Dan Aronson <Dan@Think.COM>

       Does anyone know how to do really fast arefs on the lispm (even faster than
       sys:array-register).  I have complete control over my arrays, I can wire
       them down if necessary.  

       I have found sys:%make-pointer-offset and
       sys:%p-contents-increment-pointer, which probably play a role, but I don't
       know how to use them.  Has anyone had any experience trying to do this?

   Well, if you're trying to speed up iteration over an array, and the
   thing that bothers you is the repeated pusing of the index,
   sys:%p-contents-increment-pointer seems to be what you want.
   (sys:%p-contents-increment-pointer local-var) appears to be equivalent to

	   (setq local-var
		 (sys:%make-pointer-offset sys:dtp-locative
					   local-var 1))

   but it compiles into a single instruction; it's the moral equivalent of
   C's pointer++ (except that it doesn't also return a useful value).

Actually the example I saw (in the IP checksum routine) uses this as if it
returns the expected value.

   A really fast loop over an array could be written

   (loop with cur-ptr = array
	 with end-ptr = (sys:%make-pointer-offset sys:dtp-locative array
						  (1- (sys:%structure-total-size array)))
	 until (eq cur-ptr end-ptr) do
     (sys:%p-contents-increment-pointer cur-ptr)
     (do-something (location-contents cur-ptr)))

   Note that the compiled code for the above loop is only two instructions
   shorter than the loop generated by

	   (loop for x being the array-elements of array do
	     (do-something x))

   It's 8 instructions vs. 10, so I guess a 20% speedup of an inner loop is
   nothing to sneeze at, although this assumes that (do-something x) is
   trivial (like stuffing the element down the channel to the Connection

While having fewer instructions is good, it is more to the point to consider the
instructions generated.  The second form generates a call to the array accessing
microcode (ar-1) while the first uses a different primitive which presumably
does no bounds checking etc.