LISPM I/O performance



Over the years, I've criticized LISPM I/O speeds often enough
myself, but I think your complaint is a bit off base and the
balance needs to be redressed a bit.

    I don't know about using it as a benchmark, but it is a task that
    Symbolics should address.  Processing an ascii stream is a task every
    machine should be able to perform efficiently.  

First off, I doubt you have been working with ASCII streams on
the LISPM.  Did you use :ELEMENT-TYPE 'STRING-CHAR?  If not,
then you were not working with ASCII, but with the LISPM's own
much richer character set, which involves a fair bit of additional
checking, especially when done character-at-a-time as READ does.

The amount of overhead involved depends greatly on what kind of
file access you were using.  It's been a long time since I've
done any systematic measurements, but I would expect to see ratios
of 30% - 300%, depending on exactly what kind of connections you're
working with.

I will say that your times are not at all consistent with my observations.
Using TCP and NFILE to a MacIvory from a 3640, to read the numbers 0 - 9999,
took 30 seconds using :ELEMENT-TYPE 'CHARACTER and 20 using :ELEMENT-TYPE
'STRING-CHAR, for 58890 bytes of data.  Extrapolating to 5 megabytes is only
about 43 minutes even for :ELEMENT-TYPE 'CHARACTER.

Printing to a :ELEMENT-TYPE 'STRING-CHAR stream took only 10 seconds.  We're
talking about only 12 minutes or so to dump your data, not 48 hours.
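
The extrapolations above are simple linear scaling of the measured times.
As a rough sketch in portable Common Lisp (the helper and variable names are
made up for illustration, and "5 megabytes" is assumed to mean 5,000,000
bytes):

```lisp
;; Hypothetical helper: scale a measured time linearly to a larger data size.
(defun extrapolate-minutes (measured-seconds measured-bytes target-bytes)
  "Return the estimated time in minutes for TARGET-BYTES at the measured rate."
  (/ (* measured-seconds (/ target-bytes measured-bytes)) 60.0))

(defparameter *sample-bytes* 58890)            ; size of the test file above
(defparameter *target-bytes* (* 5 1000 1000))  ; assumption: decimal megabytes

(extrapolate-minutes 30 *sample-bytes* *target-bytes*) ; read, 'CHARACTER
(extrapolate-minutes 20 *sample-bytes* *target-bytes*) ; read, 'STRING-CHAR
(extrapolate-minutes 10 *sample-bytes* *target-bytes*) ; print, 'STRING-CHAR
```

Under those assumptions the three calls come to roughly 42, 28, and 14
minutes: minutes in every case, not hours.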

Perhaps you dumped your data as a list?  If so, watch out for the value of
*PRINT-PRETTY*!  The pretty printer can consume a *LOT* of time on large
lists, and I would not be surprised by 48 hours.  Or did you use FORMAT?
That's also slow, but not as much.
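
One way to rule the pretty printer out is to bind *PRINT-PRETTY* to NIL
around the dump and to write the numbers individually rather than as one
big list.  A minimal sketch in portable Common Lisp follows; the function
name and the one-number-per-line format are assumptions for illustration,
not a description of what your program actually does:

```lisp
;; Hypothetical dump routine: one integer per line, no pretty printing,
;; and PRIN1/TERPRI instead of FORMAT.
(defun dump-integers (integers stream)
  "Write INTEGERS to STREAM one per line, with the pretty printer disabled."
  (let ((*print-pretty* nil)   ; keep the pretty printer out of the picture
        (*print-base* 10)      ; plain decimal, readable from C
        (*print-radix* nil))   ; no radix prefix or trailing dot
    (dolist (n integers)
      (prin1 n stream)
      (terpri stream))))

;; Example use on a string stream:
(with-output-to-string (s)
  (dump-integers '(0 1 42) s))
```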

There's a lot of criticism that can be legitimately leveled against
LISPM I/O speed, and I've made my own complaints from time to time over
the years.  However, I think in your case you've blamed "LISPM I/O Slowness"
as an easy villain, and have probably missed the real culprit.

						    You shouldn't need to
    create 45 different flavors just to read in a few integers.  

I cannot figure out the relevance of this comment to the rest of your
remarks.  What do flavors have to do with this?  Am I missing something?

								 And there
    should be a documented function which performs efficient reads.  

There are lots of them.  The most appropriate for your application,
in my judgement, is READ-BYTE, but that's not the only choice, either.
If you really wanted high performance, I'd recommend :GET-INPUT-BUFFER
on a buffered (UNSIGNED-BYTE 8) stream and AREF, but Common Lisp
doesn't offer this kind of IO facility, so you might not want to do
that.  (Does any other Common Lisp provide this kind of high-performance
IO?)
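
To make the READ-BYTE suggestion concrete, here is a rough sketch in
portable Common Lisp.  It assumes the file holds unsigned decimal integers
in ASCII, separated by any non-digit bytes; the function name is invented
for illustration.  It does not use :GET-INPUT-BUFFER, which is specific to
the LISPM.

```lisp
;; Hypothetical reader: accumulate decimal digits byte by byte.
(defun read-integers-from-bytes (stream)
  "Read unsigned decimal integers from STREAM, a binary stream of
(UNSIGNED-BYTE 8).  Any non-digit byte is treated as a separator."
  (let ((result '())
        (value 0)
        (in-number nil))
    (loop for byte = (read-byte stream nil nil)
          while byte
          do (if (<= 48 byte 57)          ; ASCII codes for #\0 through #\9
                 (setf value (+ (* value 10) (- byte 48))
                       in-number t)
                 (when in-number          ; a separator ends the current number
                   (push value result)
                   (setf value 0
                         in-number nil))))
    (when in-number                       ; file ended without a separator
      (push value result))
    (nreverse result)))

;; Example use ("numbers.txt" is a placeholder file name):
;; (with-open-file (in "numbers.txt" :element-type '(unsigned-byte 8))
;;   (read-integers-from-bytes in))
```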

For *REALLY* high performance, using the FEP filesystem and block IO you
can really move the data, at nearly disk transfer speeds.  I won't even
ask about other Lisps offering that kind of IO performance.

I have on several occasions, in several different lisps, including on SUN's,
rescued grossly slow programs by avoiding the use of READ.  On a SUN 3
over NFS under both Franz Allegro and KCL, I have seen an order of magnitude
speedup by replacing READ of structures with READLINE of a file in a
specific format.

    I didn't use dump-forms-to-file.  I need my data to be in a readable
    form because the application which would use the data is written in C for a SUN.
    The whole point of my problem is that the only reason I will use a
    LispM is because the programming environment helps me get applications
    up and running very quickly.  Since I am just using the LispM as a
    front end for a CM in this instance, it *shouldn't* be the bottleneck,
    for development time or run time.  It turns out that the LispM is the
    bottleneck with respect to both issues.

Frankly, it seems to me to be kind of silly to be reading this data
with a very general parser (i.e. READ) like this.  If you HAVE to have
it as ASCII (and given the size, that seems inadvisable), :STRING-LINE-IN
and PARSE-INTEGER seem like a better basis for comparison between LISP
and C.
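
A minimal sketch of the line-at-a-time approach, in portable Common Lisp:
it uses READ-LINE as a stand-in for the LISPM's :STRING-LINE-IN message,
and it assumes one integer per line.  The function name is invented for
illustration.

```lisp
;; Hypothetical reader: one READ-LINE and one PARSE-INTEGER per number,
;; instead of a call to the general-purpose READ.
(defun read-integers-by-line (stream)
  "Read one integer per line from the character stream STREAM.
Lines that do not start with an integer (e.g. blank lines) are skipped."
  (let ((result '()))
    (loop for line = (read-line stream nil nil)
          while line
          do (let ((n (parse-integer line :junk-allowed t)))
               (when n
                 (push n result))))
    (nreverse result)))

;; Example use on a string stream:
(with-input-from-string (in (format nil "0~%1~%~%42~%"))
  (read-integers-by-line in))
```

This is much closer to what a C program using fgets and atoi would be
doing, so it makes for a fairer comparison than READ.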

Sometimes there's a tradeoff between "quick and dirty and algorithmically
slow" and "blazing fast".  People tend to forget this when comparing LISP
and C because in C they always have to code everything from scratch,
so the C approach is generally less general, and sometimes faster.

(It works the other way sometimes, too, of course...you tend to hesitate
before writing hash tables in C...)
    
      On a sun 3/50 this takes 
	 under a minute to manipulate with say, wc, cat, and grep '111'.
    
    Actually, we have a SUN 4/280, and using a version of egrep with a
    variation on the Boyer-Moore algorithm, I can egrep through 10meg (the
    Brown Corpus) in 3 seconds!  Try and come close to that on a LispM.

Local disk, no doubt, not NFS.  I won't even think about doing a timing
because I don't have an XL400 or UX400S, so the comparison would hardly
be fair.  And I won't allege LMFS is a speed demon.

However, compared to the speeds you're alleging for LISPM's: I'm sure it
wouldn't be all that slow, especially from the FEP filesystem.  I
routinely do searches of much larger amounts of data (40-50 MByte) using
:Find String, spread through a few hundred files, over NFILE, to a
MacIvory! It takes just a few minutes, with a large amount of that being
to open the files.

It's a *LOT* slower than EGREP, and a lot slower than it ought to be.
But if it were as slow as your complaint would indicate, it would be
something that could only be done overnight.

    My projects generally take advantage of a number of different types of
    machines (our network has LispM's, various UNIX machines, and a CM),
    and I need to be able to process the same data everywhere.  I want to
    use each machine for what it is good at, but, based on the LispM I/O
    rate, performance-wise the LispM isn't good at any aspect of my project.

You also don't tell us just what was slow.  Was it spending all of its time
running, or was it waiting for the network?  If the latter, it's quite possible
you have a bad transceiver or other hardware network problem.  Or perhaps there's
some sort of software problem with NFS between your LISPM and your Sun?  Perhaps
you have eliminated these factors, but I can't tell that from your messages.