Timings
You also imply
that you speeded up the general function-to-function interface of KCL?
Yes, I have sped up the `general calling' of functions. In AKCL
there is no difference between calls across files and calls inside a file
(there is still one uncommon exception remaining, but hopefully it will
go away too). I have also made it possible for many more functions
to be proclaimed, by eliminating restrictions on the args of
proclaimed functions. Here is a summary of the results, from the
file doc/fast-link. The tests were run on a Sun 3/50.
Incidentally I would like again to thank and commend Hagiya and
Yuasa for making KCL so wonderfully extensible and publicly available.
Without their foresight, the following slight improvements would not
have been possible.
Inserting file /usr2/skcl/doc/fast-link
---Begin File /usr2/skcl/doc/fast-link---
Description of Fast Link option for KCL
Author: Bill Schelter
When we refer to times of function calls, without other qualification,
we will be referring to the simplest possible function of no args
returning nil: (defun foo () nil). This provides a good general indication
of the timing of all functions.
The original KCL function calling system distinguishes between
calls to functions defined in the same file and calls to proclaimed
functions, and uses different calling mechanisms at different safety levels.
Some disadvantages were that calling across files always took at least
50 mu (microseconds), regardless of proclamations or safety. Function calls
inside a file were either fast at safety 0 (10 mu, or 3 mu for proclaimed
functions) but incapable of being traced or redefined, or else as slow as
calls across files.
We wished to have a scheme which would allow tracing and redefinition
of all calls, as well as very fast calling.
In order to do this we set up links in the calls, and these are modified
at the first call to the function, if the function is compiled. Recompiling,
tracing, or redefining undoes the link.
(use-fast-links t) turns this feature on, and it is on by default.
An argument of nil turns it off, so that all calls go through the function
symbol.
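The linking scheme just described can be modelled in portable Common Lisp. This is only a sketch of the idea, not the AKCL implementation: the names call-link, make-unlinked, call-through, unlink and demo-blue are all hypothetical, and unlike AKCL the sketch links every function, whether compiled or not.

```lisp
;; Hypothetical model of a call link; none of these names come from AKCL.
(defstruct call-link
  name      ; symbol naming the callee
  target)   ; function called directly once linked, or NIL while unlinked

(defvar *all-links* '()
  "Every link created so far, so that UNLINK can find and reset them.")

(defun make-unlinked (name)
  "Create a link to NAME that has not yet been resolved."
  (let ((link (make-call-link :name name :target nil)))
    (push link *all-links*)
    link))

(defun call-through (link &rest args)
  "Call the function behind LINK, resolving the link on first use."
  (unless (call-link-target link)
    ;; Slow path, taken once: go through the function symbol and
    ;; remember the function object in the link.
    (setf (call-link-target link)
          (symbol-function (call-link-name link))))
  ;; Fast path: no lookup through the symbol.
  (apply (call-link-target link) args))

(defun unlink (name)
  "Reset every link to NAME, as recompiling, tracing or redefining would."
  (dolist (link *all-links*)
    (when (eq (call-link-name link) name)
      (setf (call-link-target link) nil))))

;; Usage: a stale link keeps reaching the old definition until it is undone.
(defun demo-blue () nil)
(defvar *demo-link* (make-unlinked 'demo-blue))
(call-through *demo-link*)          ; first call resolves the link
(setf (symbol-function 'demo-blue) (lambda () :redefined))
(call-through *demo-link*)          ; still calls the old definition
(unlink 'demo-blue)
(call-through *demo-link*)          ; => :REDEFINED
```

The middle call shows why the link has to be undone whenever the definition changes: without the call to unlink, the cached function object would hide the new definition.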
Some timings on the fast link compiling provided in this version of kcl.
FILEA:
(proclaim '(optimize (safety 0)))
(proclaim '(function blue () t))
(proclaim '(function blue1 (t) t))
(proclaim '(function blue2 (t t) t))
(proclaim '(function blue-same-file () t))
(defun blue-same-file () nil)
(defun test-blue (n)
  (sloop for i below n do (blue)))
(defun test-blue1 (n)
  (sloop for i below n do (blue1 nil)))
(defun test-blue2 (n)
  (sloop for i below n do (blue2 nil nil)))
(defun test-blue-same-file (n)
  (sloop for i below n do (blue-same-file)))

FILEB:
(defun blue () nil)
(defun blue1 (x) x nil)
(defun blue2 (x y) x y nil)

Compile and load FILEA then FILEB.
Timings: We timed the invocation of blue, blue1, blue2 and blue-same-file
by executing the loops in FILEA. We subtracted the time for
one empty loop iteration (2.7 mu). Times are in microseconds (mu) per call.
Call                New     Old
(blue)              3.03    60.5
(blue1 x)           4.1     62.2
(blue2 x y)         5.1     64.3
(blue-same-file)    3.03    2.73
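The subtraction behind these figures can be sketched as follows. The helper names run-seconds and per-call-microseconds are hypothetical and are not part of KCL; the sample figures are the ones given in the Appendix of this file (5.733 secs run time for 1,000,000 calls of blue, 2.70 secs for the empty loop).

```lisp
(defun run-seconds (thunk)
  "Run THUNK once and return the run time it used, in seconds (a rational)."
  (let ((start (get-internal-run-time)))
    (funcall thunk)
    (/ (- (get-internal-run-time) start)
       internal-time-units-per-second)))

(defun per-call-microseconds (loop-seconds empty-seconds iterations)
  "Time per call in microseconds, after removing the empty-loop overhead."
  (/ (* (- loop-seconds empty-seconds) 1000000) iterations))

;; Appendix figures for BLUE, written as rationals to avoid float rounding:
(per-call-microseconds 5733/1000 27/10 1000000)   ; => 3033/1000, about 3.03 mu
```

A measurement of one's own would pass the results of run-seconds for the calling loop and for the empty loop as the first two arguments.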
As can be seen, all calls of blue across files are substantially sped up,
while calls in the same file are slightly slowed down (3.03 mu against
2.73 mu). There is however the advantage that calls in the same file can
now be traced or redefined, since they are handled in exactly the same
manner as the non-local calls. It is also conceivable that a program might
want to change a definition dynamically, and it is no longer necessary to
recompile the whole file to do so.
Since most software projects consist of more than one file, and
since it is customary to move key routines to basic files at
the beginning of the system, we feel that having fast
calls across files is important. For example in MAXIMA, there are
380 calls to ptimes, with naturally the large majority being in files
other than the one containing the basic definition. It is useful if
those calls can be made faster too. Also when debugging some chunk of
MAXIMA code, it is useful to be able to trace ptimes without having to
load in new definitions and recompile.
Disadvantages: The link table data takes up approximately 10 words for
each function called, independent of the number of calls in a file to
that function.
Space:
I made a file with
(defun try (a b) a b
  (foos a b) (foos a b) (foos a b) (foos a b) (foos a b)
  (foos a b) (foos a b) (foos a b) (foos a b) (foos a b)
  (foos a b) (foos a b) (foos a b) (foos a b) (foos a b)
  (foos a b) (foos a b) (foos a b) (foos a b) (foos a b)
  (foos a b) (foos a b) (foos a b) (foos a b) (foos a b)
  )
I compared the size with various settings of *fast-link-compile*
and with proclaiming foos.
DIFF means the size (dec) above the case with all calls to FOOS removed.
FLC is the setting of *fast-link-compile*, and SAMEFILE says whether the
definition of foos is in the same file.

text   data   bss   dec    DIFF   FLC   proclaimed        Case   SAMEFILE
1076   0      28    1104   836    nil   nil               I      nil
1308   0      32    1340   892    nil   nil               Ia     t
1296   4      28    1328   1060   t     nil               II     nil
1436   4      32    1472   1056   t     nil               IIa    t
684    4      28    716    448    t     t                 III    nil
244    0      24    268    0      t     (calls removed)   IV     nil
384    0      32    416    0      nil   (calls removed)   V      t
The reason II is bigger than I is that the vs_top and vs_base settings
are being performed in the calling file, in exactly the same manner as if
the definition of foos were in that file. FLC=nil with the definition of
foos in the same file would also be higher. There should probably be a type
of proclamation which would favor the case I call in cases where speed
is irrelevant. But then why not go with III?
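The dec and DIFF columns can be checked with a little arithmetic: dec is the sum of text, data and bss, and DIFF is dec minus the dec of the matching case with the calls removed. A small sketch, with the hypothetical helper names dec-size and diff-size, using the figures for cases I, II and III against case IV:

```lisp
(defun dec-size (text data bss)
  "Total size: the sum of the text, data and bss segments."
  (+ text data bss))

(defun diff-size (text data bss baseline-dec)
  "Size above the baseline, that is, above the file with the calls removed."
  (- (dec-size text data bss) baseline-dec))

(defvar *case-iv-dec* (dec-size 244 0 24))   ; => 268, calls removed

(diff-size 1076 0 28 *case-iv-dec*)          ; case I   => 836
(diff-size 1296 4 28 *case-iv-dec*)          ; case II  => 1060
(diff-size 684 4 28 *case-iv-dec*)           ; case III => 448
```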
Appendix:
Notes:
1) Empty loop takes 2.70 seconds for 1,000,000 iterations.
2) blue-same-file or blue
>(time (test-blue 1000000))
real time : 5.750 secs
run time : 5.733 secs
NIL
>(trace blue)
(BLUE)
>(test-blue 2)
1> (BLUE)
<1 (BLUE NIL)
1> (BLUE)
<1 (BLUE NIL)
NIL
>(trace blue-same-file)
(BLUE-SAME-FILE)
>(test-blue-same-file 2)
1> (BLUE-SAME-FILE)
<1 (BLUE-SAME-FILE NIL)
1> (BLUE-SAME-FILE)
<1 (BLUE-SAME-FILE NIL)
NIL
---End File /usr2/skcl/doc/fast-link---
- References:
- Timings
- From: Jon L White <edsel!jonl@labrea.stanford.edu>