[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

CDR coding is your friend.



Well, Johan, luckily cdr-coding predates me by a few years, so I
personally feel no need to defend it on philosophical grounds.
Moreover, I don't recall anyone ever accusing me of being an elegant
programmer, so I'll happily call it a crock, too.

At Symbolics people are use to cdr-coding, and write programs which
don't use RPLACD without even thinking.  (And many people 1have0 accused
Symbolics personnel of writing programs without thinking.)  Some have
even gone off the deep end and become functional programmers.  But
that's another story.

It is true that people who are not use to this discipline find it
surprising.  What can I say?  If we slow down one function to speed up
ten, inevitably one customer will complain that the one function is
slower, even in the improbable case that other customers like the
speedup of the other ten.

Now, just to fill out the already full story:

    Date: Sat, 15 Aug 87 16:09 PDT
    From: dekleer.pa@Xerox.COM

    RPLACD's into CDR-coded lists presumably generate invis pointers to full
    cons cells.  This presumably has nasty consequences (its hard to tell
    exactly of course because the Symbolics debugging tools make it hard to
    tell exactly what the list structures you construct truely look like):

      (1) Constructing these invis pointers and consing is extremely slow.  By my crude
	  measurements its at least 10x slower than a regular RPLACD.  There is this nasty
	  function RPLACD-ESCAPE which always seems to show up with high percentages when
	  I meter.

Constructing the invisible pointers is fast.  Allocating the target cons
is relatively "slow"; which is to say it's the same speed as the CONS
function.  (Not that I expect you to give a hoot about this distinction.)

       (2) CDR'ing down this list will be slow, as invis pointers have to be chased.  So 
	   CDR is slowed down.

Right; CDR on a forwarded cell will take an extra memory reference (and
possibly an extra page fault) as compared to CDR on a normally coded
cell.  But then again, CDR on a normally coded cell is slower than CDR
on a cdr-coded cell.

But it doesn't matter, since before you finish typing your rebuttal GC
will have removed the forwarding pointer and left things just like you
hadn't ever had a forwarding pointer.

       (3) Memory utilization is bad.  After a single RPLACD on a CDR
	   coded list you've wasted 50% of the space for that list
	   element. 

Right, until GC collects the wasted space.

       (4) Question: Will GC pick up the unused fragments of cdr-coded
	   lists?  If not, bad news.  I hope the implementation doesn't
	   build long chances of invis pointers.  (I haven't checked,
	   but I've seen other situations in which this happens).

I don't understand how you could possibly get long chains (chances?) of
rplacd forwarding pointers.

    I discovered these things the hard way.  The moral is, if you use
    RPLACD, don't use CDR-coded lists ***AT ALL*** or only for trivialities.

Another possible moral is, avoid destructively modify data structures.

    And you have to read the manuals very carefully to determine what
    CommonLisp functions return a cdr-coded list and which do not.  CONS
    does not CDR-code, LIST does. I sped up one module by an order of
    magnitude by replacing all occurences of (LIST <x>) by (CONS <x> NIL).
    (Which is not so easy because many of the standard macros and functions
    implicitly use LIST's not CONS's).

I've sped up many programs that way myself.  NCONS works just as well.

    In general, I think CDR-coding is a crock, and not worth all the trouble
    it generates for the implementor(you) and user(me) just to save the few
    cons cells which take up a small percentage of the real and virtual
    memory space.

Yeah, you're right.  What's 20,000 pages anyway?  Memory's cheap.

;; In DLA's development world, after running for 5 days:
;;    7,396,449 out of 26,627,901 words (28%) of the world is list space.
;;    Scanning cdr-codes ...
;;    Found 1,010,478 normal conses taking up 2,020,956 words.
;;    5,375,493 words represent cdr-coded lists, and of these 2,131 are forwarded.
;;    5,375,493 words would be wasted by using normal coding,
;;    but 2,131 words would be saved because forwarding would not be needed,
;;    resulting in a net savings of 5,373,362 words.
;;
;; In a freshly-booted in-house 7.1 world:
;;    5,777,634 out of 13,177,857 words (44%) of the world is list space.
;;    Scanning cdr-codes ...
;;    Found 521,341 normal conses taking up 1,042,682 words.
;;    4,734,952 words represent cdr-coded lists, and of these 1,390 are forwarded.
;;    4,734,952 words would be wasted by using normal coding,
;;    but 1,390 words would be saved because forwarding would not be needed,
;;    resulting in a net savings of 4,733,562 words.

;; Note:  This function doesn't find lists which are embedded in
;; structures, such as constants in compiled-functions.
(DEFUN %-LIST ()
  (LET ((LIST 0)
	(OTHER 0)
	(NORMAL 0)
	(FORWARDED 0))
    (SI:INHIBIT-GC-FLIPS
      (SI:GC-RECLAIM-OLDSPACE)
      ;; Quick check for list space.
      (SI:MAP-OVER-REGIONS NIL NIL
	(LAMBDA (IGNORE REGION)
	  (SELECTOR (LDB SYS:%%REGION-REPRESENTATION-TYPE (SYS:REGION-BITS REGION)) =
	    (SYS:%REGION-REPRESENTATION-TYPE-LIST
	      (INCF LIST (SYS:REGION-FREE-POINTER REGION)))
	    (OTHERWISE
	      (INCF OTHER (SYS:REGION-FREE-POINTER REGION))))))
      (FORMAT T "~&~:D out of ~:D words (~D%) of the world is list space."
	      LIST (+ LIST OTHER) (ROUND (* LIST 100) (+ LIST OTHER)))
      ;; Long check for cdr-codes.
      (FORMAT T "~%Scanning cdr-codes ...")
      (SI:MAP-OVER-REGIONS NIL #'SI:REGION-PREDICATE-LIST
	(LAMBDA (IGNORE REGION)
	  (LET* ((ORIGIN (SYS:REGION-ORIGIN REGION))
		 (LIMIT (+ ORIGIN (SYS:REGION-FREE-POINTER REGION))))
	    (SI:SCANNING-THROUGH-MEMORY %-LIST (ORIGIN LIMIT)
	      (LOOP FOR ADDRESS FROM ORIGIN BELOW LIMIT DO
		(SI:CHECK-MEMORY-SCAN %-LIST ADDRESS)
		(SELECTOR (SYS:%P-CDR-CODE ADDRESS) =
		  (SYS:CDR-NORMAL (INCF NORMAL))
		  (OTHERWISE
		    (SELECTOR (SYS:%P-DATA-TYPE ADDRESS) =
		      (SYS:DTP-HEADER-FORWARD (INCF FORWARDED))))))))))
      (LET ((NORMAL-WORDS (* NORMAL 2))
	    (CODED-WORDS (- LIST (* NORMAL 2))))
	(FORMAT T "~%Found ~:D normal conses taking up ~:D words."
		NORMAL NORMAL-WORDS)
	(FORMAT T "~%~:D words represent cdr-coded lists, and of these ~:D are forwarded."
		CODED-WORDS FORWARDED)
	(FORMAT T "~%~:D words would be wasted by using normal coding,~@
		   but ~:D words would be saved because forwarding would not be needed,~@
		   resulting in a net ~:[savings~;loss~] of ~:D words."
		CODED-WORDS FORWARDED
		(< CODED-WORDS FORWARDED)
		(ABS (- CODED-WORDS FORWARDED)))))))