
benchmarks



    Date: Thu, 22 Feb 90 10:45 EST
    From: Qobi@zermatt.lcs.mit.edu (Jeffrey Mark Siskind)

    Personally, I feel that the only good measure of performance is
    ``stop-watch time''---the actual elapsed time from when I hit
    the return key until I see the results. This includes all of
    the I/O and OS overhead because my computer has to do that when
    solving my problem just as much as it has to run my code in
    user mode. For all experiments, I ran on a machine that was
    not running other user tasks. But they were presumably running
    other system processes which is the normal everyday working
    configuration of the machine. I think that what I called ``Elapsed
    time'' did actually measure ``stop-watch time.'' Accordingly,
    I think that these are the most reliable numbers to use when
    comparing benchmark results.

If you can get exclusive use of a system, this is a reasonable way to do
benchmarks.  However, the TIME macro doesn't know that you've arranged
this, so it reports the two numbers.

Traditionally, "CPU time" refers to the amount of time the processor
spends executing within a given process; in most systems this is
implemented by the scheduler noting the time when it schedules a process
to run, noting the time when it next deschedules that process, and
adding the difference to a total for the process.  On timesharing
systems this is useful since it factors out the load of the system
somewhat; a program should take approximately the same amount of CPU
time no matter how many other users are using the system.  Systems
differ in whether they include time spent in the kernel in this amount;
some systems distinguish "system time" and "user time".  Sometimes it's
useful to disregard system time, because it is more affected by
activities of other users (for example, if you're timing file access,
the system time will depend on whether another user had recently
accessed the same or a nearby file).
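
As a rough illustration of the distinction, here is a minimal sketch in
Python (not Lisp, and not how TIME is implemented) that reports elapsed
time alongside the user and system CPU time the operating system has
charged to the process.  The names measure and busy_sum are made up for
this example, and the resolution of the CPU figures depends on the
platform.

```python
import os
import time

def measure(fn, *args):
    """Run fn(*args) once; return (result, elapsed, user, system) in seconds."""
    cpu_before = os.times()            # CPU time charged to this process so far
    wall_before = time.perf_counter()  # "stop-watch" clock
    result = fn(*args)
    wall_after = time.perf_counter()
    cpu_after = os.times()
    return (result,
            wall_after - wall_before,
            cpu_after.user - cpu_before.user,      # time spent in user mode
            cpu_after.system - cpu_before.system)  # time spent in the kernel

def busy_sum(n):
    """A purely compute-bound workload (hypothetical, for illustration)."""
    total = 0
    for i in range(n):
        total += i
    return total

if __name__ == "__main__":
    result, elapsed, user, system = measure(busy_sum, 2_000_000)
    print(f"elapsed {elapsed:.3f}s  user {user:.3f}s  system {system:.3f}s")
```

On a loaded machine the elapsed figure should grow while user + system
stays roughly constant for the same work, which is the property
described above.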

On single-user workstations such as the Lisp Machine it is less
necessary to factor out the effects of other processes, because the user
is believed to have more control over them than on a timesharing system.
In Genera you can use WITHOUT-INTERRUPTS or PROCESS:WITHOUT-PREEMPTS to
hog the processor when you want to factor out other processes, but
ordinary users on timesharing systems generally can't get exclusive use.
CPU time is supposed to be close to the time you'd see if yours were the
only process on the system.  If you just want to know whether 3650 or
Sparcstation-1 processors are faster at addition this is usually the
right way to measure it.

There are, of course, times when CPU time is completely worthless.  For
instance, if you're timing operations that use network facilities or
back-end processors, you don't want to factor out the time spent waiting
for the server.  Or if you want to know how well a program performs
under various system loads you want to measure its real time.
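
A small sketch of the first case, again in Python and under stated
assumptions: time.sleep stands in for waiting on a server, and timed and
fake_server_call are hypothetical names.  The process is descheduled
while it waits, so almost none of the wait shows up as CPU time.

```python
import time

def timed(fn):
    """Run fn() once; return (elapsed_seconds, cpu_seconds)."""
    wall_before = time.perf_counter()
    cpu_before = time.process_time()   # user + system CPU time of this process
    fn()
    return (time.perf_counter() - wall_before,
            time.process_time() - cpu_before)

def fake_server_call():
    # Assumption: sleeping approximates blocking on a network reply.
    time.sleep(0.2)

if __name__ == "__main__":
    elapsed, cpu = timed(fake_server_call)
    print(f"elapsed {elapsed:.3f}s  cpu {cpu:.3f}s")
```

Here the elapsed time is the number that reflects what the user actually
waits for; the CPU time says almost nothing about it.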

Benchmarking is very tough, in general.  Gabriel's book ("Performance
and Evaluation of Lisp Systems") discusses many of these issues.

                                                barmar