
LispMachine File Reading Time



    Date: Tue, 23 Jan 1990 9:09:38 PST
    From: Keith Price <price%iris.usc.edu@usc.edu>
    From: Robert W. Kerns <RWK@FUJI.ILA.Dialnet.Symbolics.COM>
    Subject: LispMachine File Reading Time.

	Did you use :ELEMENT-TYPE 'STRING-CHAR streams?  If not,
	you were in some sense comparing apples and oranges, and the
	real Symbolics times will be noticeably faster.
    The manual claims that that is the Symbolics default.
No it doesn't.  At least, not anywhere I can find.
The obvious place to look is OPEN, and that makes
a very conspicuous point that Symbolics differs
from CLTL in that the default is CHARACTER instead
of "ZL-USER:STRING-CHAR" (sic).

If it says something different somewhere else, please
report it as a documentation bug so that it can be
corrected for the next release.

    I only claimed that I was using the local system (system designer) defaults -
    designers are responsible for doing things right; if they don't, they deserve
    comments.

I don't have any idea what you mean.  Who is responsible for doing
what right?

Users do have SOME responsibility for reading the documentation.
To me, someone trying to write code for top performance is certainly
expected to read the manual.  (In your case, of course, you have
an excuse, since either you found a buggy section of the document,
or you misread it.  I'm just responding in general about
responsibilities, not making a complaint).  Benchmarking is more
of a gray area, depending on just what you're trying to learn from
the benchmark.

However, someone writing a program, intended to be portable, which
takes a string from the user, and includes it into a file, should
not have to know that to do so he has to write
:ELEMENT-TYPE #+Symbolics 'CHARACTER #-Symbolics 'STRING-CHAR
for it to work properly on a Symbolics system.  That's why the default
is the way it is.
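
With the CHARACTER default, the plain portable form just works.  A
minimal sketch (USER-STRING and the pathname are hypothetical):

    ;; No read-time conditional needed: the default element type can
    ;; represent any character the user's string might contain.
    (with-open-file (out "notes.text" :direction :output)
      (write-string user-string out))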

After all, defaults are generally intended to be the "normal,
usual case".  And defaults are often a way to aid in portability,
as in this case.

	  Date: Thu, 18 Jan 1990 8:37:02 PST
	  From: Keith Price <price%iris.usc.edu@usc.edu>
	  90%+ of the time goes to the (read).  

	Meaning?  I would guess that by this you mean that 90% of the time was spent
	inside READ exclusive of time spent inside READ-CHAR or :TYI.  Is this what
	you meant?

    I thought it was obvious: (read) is what the user sees; that is where the 
    time goes, not processing the results of the read.

I think you misunderstood my question.  I'm not talking about the RESULTS
of the READ; I'm asking whether your measurements include time spent in
the IO system, or only the time spent processing the characters.  There are
merits to either approach; excluding the time in the IO system measures
the performance of READ's parsing, which should be more constant than the
IO performance, which may vary widely depending on the source of the
characters.
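
One way to separate the two, assuming the file fits in memory, is to
drain the stream into a string first and then time READ against a
string stream.  This is only a sketch:

    (defun parse-time-only (pathname)
      ;; First slurp the file, so the timed phase does no IO at all.
      (let ((text (with-open-file (in pathname :direction :input)
                    (with-output-to-string (out)
                      (do ((ch (read-char in nil) (read-char in nil)))
                          ((null ch))
                        (write-char ch out))))))
        ;; Now time READ's parsing against the in-core characters.
        (time (with-input-from-string (in text)
                (do ((form (read in nil :eof) (read in nil :eof))
                     (count 0 (1+ count)))
                    ((eq form :eof) count))))))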

       If you're going to report times including Paging time on a Symbolics machine,
       you had better compare memory sizes and disk organization.  Is the Symbolics
       machine doing paging and file I/O from the same disk drive?  Is the LMFS
       scattered around in several FEP files on one drive?  Is it fragmented?  What
       kind of disk(s)?

    Paging times were included to eliminate this from being a serious problem.  The
    word included means the times are in the total, and are given separately.

Normally this would be a good approach, but if you're also including the IO
time in the read, the paging interacts with the IO.  So it doesn't work to
just subtract the paging time.

Ain't benchmarks fun?

	Did you include file opening & closing time, or just file processing time?

    Open/Closing time is not relevant when the total is > 10 seconds.  The time is 
    for the entire operation: open, read, process, close.

You'd think so, wouldn't you?  But I just measured 2.33
seconds to open and close a file on a local LMFS.  Results
will vary widely depending on the size of the directory, and
whether or not it's already cached (and for each level of
directory), but I didn't pick an example intending to be
perverse.

LMFS performance in this area is very poor, and probably does
affect your results.  However, because it's highly variable, it
reduces the repeatability of your results, which renders them
less useful.  It would be more informative to report these
times separately.
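
Timing the open and close by themselves, with nothing read in between,
shows how large that fixed cost is and how much the directory cache
changes it.  A rough sketch (the pathname is hypothetical):

    ;; Measure directory lookup, open, and close only -- no reading.
    ;; Run it twice to see the effect of the directory cache.
    (defun open-close-time (pathname &optional (repetitions 10))
      (time (dotimes (i repetitions)
              (close (open pathname :direction :input)))))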

Notice I didn't say anything about fairness.  I don't feel that
your test is unfair; I just think the predictive value is less
than it could be.

       I don't mean to imply that your benchmark isn't useful, but it would be a lot
       more useful if you were more careful about reporting your test conditions.
       Too often, people do good benchmarking and then spoil it by reporting piecemeal
       results.
    The test was of default configurations, i.e. what the system designer thinks is 
    important.  This may be more important than what is possible.  

It's certainly more likely to predict what you'd see in a typical
application ported from another environment.

It's important to be clear about just what your test is testing;
your results apply less well to someone wanting to know whether
a particular machine can meet a particular need, if he will be
creating the application.

								   Note that my 
    main concern was the relatively poor performance on input when compared
    to all other operations.  Even with the VAX8600 speed, it was slower on the
    other large tests and the Sun-4 was often slower than the Sun-3.

I think it would be useful to know how much of this is due
to OPEN being very slow in LMFS, how much is due to READ
being slow, and how much is due to the particular type of
stream involved (local LMFS :ELEMENT-TYPE CHARACTER in this
case) being slow.

Knowing these factors would help in deciding how to speed
up a particular situation.  For example, with LMFS, avoiding
the need to open up lots of little files is important.  Is it
the IO that's slow?  Or, if I avoid READ, will my application be
sped up even more on the Symbolics than it would be in the other
Lisps?
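
One rough way to split those factors apart, on the assumption that
draining the stream with READ-CHAR approximates the raw IO cost, is to
time each phase on the same file.  Again, just a sketch:

    (defun factor-file-read-time (pathname)
      ;; Factor 1: OPEN and CLOSE alone.
      (time (close (open pathname :direction :input)))
      ;; Factor 2: the stream's raw character IO, with no parsing.
      (time (with-open-file (in pathname :direction :input)
              (do ((ch (read-char in nil) (read-char in nil)))
                  ((null ch)))))
      ;; Factor 3: the same stream driven through READ's parser.
      (time (with-open-file (in pathname :direction :input)
              (do ((form (read in nil :eof) (read in nil :eof)))
                  ((eq form :eof))))))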

Please don't feel I'm picking on you.  You've done a better
job of benchmarking and reporting than is typical.

Doing good benchmarking is like doing a controlled sociology
experiment during a riot, except harder.