
LispM IO hysteria



There has been much confusion over the initial complaint about the
horrendous (my word) IO rate on Symbolics Lisp Machines.  I was the person
who originally complained.  I must not have stated my problem precisely
enough, and so confusion erupted.

My problem is simple.  I frequently need to implement algorithmically complex
programs, which manipulate large text files, in a relatively short period of
time.  The input format is streams of characters and the output format is
strings of characters.  For some of these problems, I use a CM, but the CM
has absolutely nothing to do with my complaint.

Now, I have two choices.  I can implement my theories on a Lispm in lisp,
or on a Sun4 in C.  (It is useless IMHO to use C on a Lispm or lisp on a Sun.
I will never really understand why Symbolics implemented C for the Lispms,
unless it was just for kicks.  And lisp on a Sun seems to have all of the
drawbacks of lisp on a Lispm without the wonderful features.)
On a LispM, I have access to all of the debugging tools, the wonderful
developing environment, and the snazzy graphical interface.  On a Sun4, I
have gdb, which is not horrible, gnu-emacs in X, which has excellent response
time (unlike Zmacs), and absolutely no free graphical interface.

So far, Lispms are slightly in the lead.  But then we hit the IO problem.
I can only waste so much time on hacking IO.  Now, don't get me wrong.  I'm
a hacker by nature and I love an interesting problem as much as the next guy.
But, to me, developing an algorithm to parse natural language sentences in real
time (which is what I do) is *far* more interesting than trying to figure out
how to read in a file in real time.

The biggest complaint I hear from most lisp people about C is that you end up
programming at such a low level that it becomes tedious.  But, for reading in
an ascii stream from stdin, all I need to do is:

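#include <unistd.h>

void
read_stdin()
{
  char buffer[MAX_BUFFER+1];
  int n;

  /* read() returns the byte count, 0 at end of file, -1 on error */
  while ((n = read(0,buffer,MAX_BUFFER)) > 0) {
    buffer[n] = '\0';   /* terminate so process_buffer sees a string */
    process_buffer(buffer);
  }
}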

Sorry for the C code on slug, but that's it.
No :element-type :ascii-string-char-letter-or-number-but-not-list.
No (send stream :read-xy-or-z-iop).
Just a simple read statement.  And, if it's IO from a file, I just do an open
and close.  Using this I get a few meg/minute IO rate.
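
For the file case, a minimal sketch might look like the following.  It uses
the descriptor-level open and close so that read applies directly (fopen
would pair with fread instead).  The name count_file_bytes and the
MAX_BUFFER value are assumptions for illustration only; a real program
would call its own processing routine where the byte count is accumulated.

```c
#include <stdio.h>    /* perror */
#include <fcntl.h>    /* open, O_RDONLY */
#include <unistd.h>   /* read, close */
#include <sys/types.h>

#define MAX_BUFFER 8192   /* assumed size; the post does not give one */

/* Hypothetical helper: reads the whole file in MAX_BUFFER-sized chunks and
   returns the total number of bytes read, or -1 on error. */
long
count_file_bytes(const char *path)
{
  char buffer[MAX_BUFFER];
  long total = 0;
  ssize_t n;
  int fd = open(path, O_RDONLY);

  if (fd < 0) {
    perror(path);
    return -1;
  }
  /* read() returns the byte count, 0 at end of file, -1 on error */
  while ((n = read(fd, buffer, MAX_BUFFER)) > 0)
    total += n;           /* process buffer[0..n-1] here */
  close(fd);
  return (n < 0) ? -1 : total;
}
```

Calling count_file_bytes("corpus.txt") (a made-up file name) would return
the file's size in bytes, or -1 if the file cannot be opened or read.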

Sometimes I output statistics in long ints (> 200000).  Sometimes I output
these stats in floats.  Sometimes I output strings along with these numbers.
What I can't afford to do is spend a few hours trying to find the most
efficient sequence of functions to output each different type of data.
And I especially can't afford to output these files in a non-readable (by eye)
format, since I can't tell if my theory works unless I can read my output.
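
To make the comparison concrete, here is a rough sketch of what that mixed,
eye-readable output costs in C: one format string covers a string, a long
int, and a float.  The helper name format_stat and the column widths are
assumptions for illustration, not anything from my actual programs.

```c
#include <stdio.h>

/* Hypothetical helper: formats one human-readable line of statistics into
   out and returns the number of characters written (as snprintf does). */
int
format_stat(char *out, size_t size, const char *label, long count,
            double ratio)
{
  /* left-justified label, right-justified long int, float with 3 decimals */
  return snprintf(out, size, "%-12s %10ld %8.3f\n", label, count, ratio);
}
```

Writing straight to a file is the same format string handed to fprintf.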

You can give me hacky solutions till the cows come home.  I appreciate the
effort, and I can even do these things myself.  I just can't indulge myself
in such pursuits.  I need IO to be a three line problem that I solve without
even a thought.  Reading and writing data needs to be as simple as the C
read loop above.

All I want from Symbolics is a standard reading function that lets me give
it a pre-allocated buffer and which reads in data from disk in a time
of the same order of magnitude as on a Sun.  I realize disk speeds vary.  But
disk speeds do not explain orders of magnitude slowdowns.
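
For reference, the shape of the interface I mean, sketched on the C side:
the caller owns the buffer and the function just fills it.  The name
read_into is hypothetical, and this is only an illustration of the calling
convention under POSIX read, not a claim about how any Lisp implementation
should do it.

```c
#include <errno.h>
#include <unistd.h>   /* read */
#include <sys/types.h>

/* Hypothetical helper: fills the caller's pre-allocated buffer from fd,
   retrying short reads.  Returns the number of bytes stored (less than
   size only at end of file), or -1 on error. */
long
read_into(int fd, char *buffer, size_t size)
{
  size_t filled = 0;

  while (filled < size) {
    ssize_t n = read(fd, buffer + filled, size - filled);
    if (n < 0) {
      if (errno == EINTR)
        continue;         /* interrupted by a signal; try again */
      return -1;
    }
    if (n == 0)
      break;              /* end of file */
    filled += (size_t)n;
  }
  return (long)filled;
}
```

No allocation happens inside the call, so the same buffer can be reused for
every chunk of the file.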

I realize I have been long-winded, but I wanted to be as clear as possible.
I hate to say it, but I *used* to love my Lispm.  Now, I just wish they would
fix my Sun so it can run lisp better.  Hopefully, the UX400 will be the
answer.  I am anxious to see Genera 8.0, but you will forgive me if I am
a little skeptical that much will change.

-- David Magerman
University of Pennsylvania, LINC Laboratory
** Now these are *definitely* my opinions, so don't blame anyone else here. **