
benchmarks



I don't know what the ``right thing'' to do is vis a vis CPU time vs.
elapsed time. I can only explain what I did. I used the CL TIME macro
to do all measurements. Its output differs from implementation
to implementation so it is hard to really compare the numbers. For
both Lucid and Franz, I ran the benchmark just three times, each time
doing a (TIME <benchmark-form>). Lucid and Franz both report at least
two different numbers. Lucid reports ``Total Run time'' and ``Elapsed
Real Time'' which I arbitrarily christened ``CPU time'' and ``Elapsed
time'' respectively. Franz reports ``cpu time (total) user'' and
``real time'' which again, I arbitrarily christened ``CPU time'' and
``Elapsed time'' respectively. In order to get analogous numbers
(note that I said ``analogous;'' I do not claim that they measure
the same thing, only that they are similar) for Symbolics I did
six measurements, three as (TIME <benchmark-form>) and three as
(WITHOUT-INTERRUPTS (TIME <benchmark-form>)). I christened the
former as ``Elapsed time'' and the latter as ``CPU time.'' (Somewhat
later, Michael Greenwald and Neil Mayle suggested that I use
PROCESS:WITHOUT-PREEMPTS instead of WITHOUT-INTERRUPTS since, for
reasons that I only partially understand, the former should yield
better performance on Ivory-based machines.)
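
For anyone who wants to repeat this kind of measurement without
depending on each implementation's TIME output, here is a minimal
sketch using only the standard CL functions GET-INTERNAL-RUN-TIME and
GET-INTERNAL-REAL-TIME. It is not what was used to produce the reported
numbers. The names MEASURE-ONCE and MEASURE-THREE-TIMES are
hypothetical, and what counts as ``run time'' is still
implementation-dependent, so the two values are only analogous to
``CPU time'' and ``Elapsed time'' as used here.

```lisp
;; Hypothetical helper (not from the original experiments).
;; Calls THUNK once and returns two values, both in seconds:
;;   1. run time, roughly analogous to ``CPU time''
;;   2. real time, roughly analogous to ``Elapsed time''
(defun measure-once (thunk)
  (let ((run-start (get-internal-run-time))
        (real-start (get-internal-real-time)))
    (funcall thunk)
    (let ((run-end (get-internal-run-time))
          (real-end (get-internal-real-time)))
      (values (/ (- run-end run-start) internal-time-units-per-second)
              (/ (- real-end real-start) internal-time-units-per-second)))))

;; Hypothetical helper: repeats the measurement three times, as in the
;; experiments described above, and returns a list of
;; (run-seconds real-seconds) pairs.
(defun measure-three-times (thunk)
  (loop repeat 3
        collect (multiple-value-list (measure-once thunk))))
```

A call would look like (MEASURE-THREE-TIMES (LAMBDA () <benchmark-form>)).
The resolution of both clocks varies between implementations, so very
short benchmark forms may report zero.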

I don't claim that what I did was correct. (Remember that I said
that benchmarking was not my profession and that I did this for
my own information and that I claim no responsibility for the
accuracy of the results.) I am just reporting what I did so that
the experiments are repeatable. In the file Benchmarks.text I
included the complete trace output of all the tests I did. If
someone wants to reanalyze the results, be my guest.

Personally, I feel that the only good measure of performance is
``stop-watch time''---the actual elapsed time from when I hit
the return key until I see the results. This includes all of
the I/O and OS overhead because my computer has to do that when
solving my problem just as much as it has to run my code in
user mode. For all experiments, I ran on a machine that was
not running other user tasks, although it was presumably running
other system processes, which is the normal everyday working
configuration of the machine. I think that what I called ``Elapsed
time'' did actually measure ``stop-watch time.'' Accordingly,
I think that these are the most reliable numbers to use when
comparing benchmark results.
        Jeff