[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Garnet 3.0 Speedup anyone? (fwd)




---------- Forwarded message ----------
Date: Thu, 29 Jun 95 17:43:42 +0200
From: clisp-list@ma2s2.mathematik.uni-karlsruhe.de
To: ritter@vpsyc.psychology.nottingham.ac.uk
Cc: barryb@dots.physics.orst.edu
Subject: Re: Garnet 3.0 Speedup anyone?

Here are some notes on optimizing Garnet in general I wrote up a few
years ago, I hope they help.

Cheers,

Frank


;; -*- Mode: Text -*- 
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; 
;; File            :writing-fast-garnet-code.txt|notes/
;; Author          : Frank Ritter
;; Created On      : Fri Jun  7 10:45:55 1991
;; Last Modified By: Frank Ritter
;; Last Modified On: Thu Apr  2 15:35:05 1992
;; Update Count    : 18
;; 
;; PURPOSE
;; 	Explains how to write fast Garnet code.
;;   Including how to cheat (modify Garnet Source).
;;
;; TABLE OF CONTENTS
;;	I. 	Introduction
;;	II.	Improving my code
;;	III.	Garnet Time/Space tradeoff
;;	IV.	Fast Garnet code
;;	V.	Modifying Garnet source
;;	VI.  	Conclusions and Future work
;; 
;;  Copyright 1991, Frank Ritter.
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

I. 	Introduction

I would like a big switch that I could flip to make Garnet and other
programs run faster, or else a big whip to use on them.  I can't get
these, so I have to optimize my Garnet code by hand.  Here's how I
turned a 11700 s program into a 1770 s program (a speedup of 6.6x)
without using function timing routines (not that this is a virtue,
just to note that you could speed it up even further). 

Originally the graphic display of Soar slowed Soar down by a factor of
9.4, now Soar+display runs at 1.8 the speed (4/22/91, the last
measure).  A complicating factor is that during this optimization
process the graphic display program continued to develop.  The work
was not done in stages, so the results can not be broken down between
each optimization step.

The program that I optimized was a graphic display of Soar's internal
state, part of the Developmental Soar Interface (DSI).  Soar is a
unified theory of cognition realized as a production system, using
lisp and OPS5.  A model of natural language processing (NL-Soar) was
run on a sample sentence ("The square is grey") 10 times with a limit
of how much it could process each run (50 Soar decision cycles).  Each
run was different because Soar learned on each trial, and could thus
process more and more of the sentence.  I have no reason to believe
that Soar or the graphic display would behave differently with a
different problem, and we have noticed no gross irregularities with
respect to this analysis.


II.	Improving my code

The first thing I did was speed up my own lisp code that called
Garnet.  The Python (CMU Common Lisp) reference manual provides some
hints here, and I followed what I could.  I also read Jon Bentley's
book (ref missing from desk, probably at home) "Optimizing programs."

I checked my code and took the following steps where I could:

* put in declares for many variables
* put in proclaims for many variables and functions
* put in ignores for variables where appropriate, or else cut the
  variable out entirely
* wrapped all lambdas in #', and included declares (in CLtL2 the # is
  no longer optional)
* "declared" simple routines as 'in-line
* remove all dead, duplicate or unnecessary code that I could notice
* replaced all mapcar's where appropriate with mapc or dolist
* wrapped all lambdas in #', with declares
* remove timing code from being loaded
* Loaded a few files by command only (similar to Allegro's new Pesto
  function I think).
* set the Common lisp compiler options to produce production code:
    (proclaim '(optimize
                 (speed 3)
                 (safety 0)
                 (space 3)
                 (compilation-speed 0))


III.	Garnet Time/Space tradeoff

Running the timing test on different machines I noticed a large
difference in running time between machines.  Machines with 24 M of
memory ran much faster than those with 20 M.  This indicated to me
that the Soar + DSI + Garnet (19 M lisp image) program was space
bound, not compute bound.  So further optimization were aimed at
reducing space more than time, although both were pursued.

So at least a factor of 4.5 of the 6 was due to adding 4M more memory
to my machine.  This information was immediately useful.  For example,
when giving demos we arrange to have a machine with large enough memory.

Adding even larger amounts of memory offers only a modest speedup.
For example, running on a machine with 96 Meg only offers about an
additional 5% speed improvement over one with 32 M when working for
relatively short periods of time, and for what are now only medium
sized Soar programs.


IV.	Fast Garnet code

With the idea of saving space, I did the following: (some suggestions
from Brad Meyers included here without further reference)

1) I speed up the my code by treating created Garnet objects as
recyclable rather than opal:destroy'ing them after I was finished with
them.  This has been done for problem spaces, states, operators and
chunk explosions, but not yet for goals and chunks.

2) I removed unnecessary defuns that created new gadgets when called.
These were left over from development where I would use a defun to
create several objects at once.  What I do now is build as many
possible gadgets at load time.

3) Where possible, I replaced o-formulas with algebra in static
(status type) windows.


V.	Modifying Garnet source

  Desperate men do desperate things.

I really had to speed things up, so initially I applied most of the
lisp optimizations that I had already used on my own code to Garnet
(there were too many to count, less than in my own code, but not as
good as the Soar soarce).  Here's the actual list.  You can note that
it is a slightly different set that what I did to my own code.

A. removed whole files that I don't use

   I don't use all of the gadgets, so I stopped loading the source files
   of the following 12 unused gadgets:

  * roundtangles
  * graphics-selection
  * arrow-line
  * save-agg
  * browser-gadget
  * polyline-creator
  * graphics-selection
  * gauge
  * scrolling-input-string
  * scrolling-labeled-box
  * v-slider
  * h-slider
  
   B.  Drastic cuts

To even further speed things up I cut drastically in the Garnet
source.  I cut in several places:

1) I removed shadows from as many gadgets as I could.  In some places
I couldn't cut them out cleanly, and I may not have gotten all the
procedural references, but the structures for the most part were taken out.

I cut out two features that are terribly useful in a development
environment, but which my users wouldn't see.

2) I removed @i(if-debug) statements from interactors in my final
release with #-release-garnet.  dbprint-* are also if-debugs, so I
wrapped them too.

3) I removed all comments to non-user visible defuns and variables
with #-release-garnet.

I further cut:

4) I cut some debugging functions that weren't in the debug package
with #-release-garnet too.

5) I wrapped all demo code at the end of gadgets with #| and |#.

6) I removed all provides and requires since I don't use them, I use
something called SEM.

7) I single source the whole garnet system (using SEM), and then to
get it to compile in reasonable time, I must cut it in thirds.  This
seems to make it load and run faster, but take longer to compile.  I
like that tradeoff.

8) I removed all exports of dead code and gadgets.

9) I reduced the number of default objects in scrolling-menus, and
other objects with items.

10) I create :width and :height formulas for my aggregates based on
the subset of parts I know it will depend on, e.g., a Soar problem
space is based on its graphic triangle mostly, so that it does not
have to be computed from the more general (and slower) default
formulas.

11) I cut down the size of the halftone table in halftones.lisp.

12) I cut a lot of the object hierarchy that I don't use, such as
line-8, and motif-colors.

13) interactors.lisp includes code for 12? kinds of start-where types.
Garnet itself only uses :in-box, :in, and :element-of.  The debug
functions also use :leaf-element-of-or-none.  So I cut a lot of code
in there.

14) Turned type checking off with (opal:type-checking nil).


VI.  	Conclusions and Future work

Garnet is not so much a slow programming environment so much as it is
an extensive environment; that's why I choose it in the first place.
Users that wish to deliver interfaces based on Garnet, or who find
Garnet slow, may be well advised to not load all of Garnet, and
specifically in the case of delivery, strip down Garnet for delivery.

Garnet could be optimized more by the Garnet group, and a few of the
features implied above (i.e., recyclable objects, full use of declare)
should be added on a system level.  Providing a developmental and
delivery system may not be feasible, but as Garnet users develop more
deliverable systems, further efforts to ease the delivery of systems
appears to be warranted.

I didn't clean up all I could.  There are several loose ends that I
can enumerate:

0) The ball is now back in Garnet's court.  They should put the
proclaims and declares in, and remove the bug that declares:
"Warning: variable KR::METHOD is used yet it was declared ignored"

1) Most importantly, Garnet and the DSI should be examined with timing
routines to see where the cycles are going.

2) There are some shadows left on text-buttons somewhere.

3) I did not remove *all* doc-strings to non-user visible defuns, just
most of them.

4) I used a loop macro (the so-called Yale Loop macro).  Where it is
used I could use dolists and do's that should be faster, and I would
save the space loading yloop.  [since removed]

5) The display variable "machine-display-name" is computed too often
by my code and by Garnet's versions of the same functions.  Garnet
should set it up once, and use a variable rather than a function call.
Here's an example where I compute it, and I suspect that Garnet did
wherever I copied this from

(defun sx-event-and-lisp-loop (&optional window)
  "A loop that reads input and calls code to handle xevents
when they happen."
  (print-sx-prompt)
  ;; display is the X display to write to.
  (let ((display (if window
                     (opal::display-info-display (g-value window :display-info))
		     (let ((win1 (caar (opal::get-table-contents))))
		       (if win1
			   (xlib:window-display win1)
                           opal::*default-x-display*)))))
  ;; gnu will buffer input for us, so that's cool, and event-listen
  ;; tells us when there's an x-event.  If gnu is not there, a user
  ;; is committed to typing something if he starts to.
  (prog () start
        (cond ( (listen *standard-input*) 
		(sx-repl) )
	      ( (event-listen display)
                (opal::default-event-handler display :timeout 0) )
              ( t (sleep sx-event-and-lisp-loop-sleep-time) )      )
	(go start))
   ;; dump the event that made you quit?
   (xlib:event-case (display :discard-p t :timeout 5) ; discard current event
     (otherwise () t))))