Re: LispM IO hysteria
Posted-Date: Sun, 28 Jan 90 03:35:35 -0500
To: SLUG@warbucks.ai.sri.com
Subject: LispM IO hysteria
Date: Sun, 28 Jan 90 03:35:35 -0500
From: magerman@linc.cis.upenn.edu
So far, Lispms are slightly in the lead. But then we hit the IO problem.
I can only waste so much time on hacking IO. Now, don't get me wrong. I'm
a hacker by nature and I love an interesting problem as much as the next guy.
But, to me, developing an algorithm to parse natural language sentences in real
time (which is what I do) is *far* more interesting than trying to figure out
how to read in a file in real time.
The biggest complaint I hear from most lisp people about C is that you end up
programming at such a low level that it becomes tedious. But, for reading in
an ascii stream from stdin, all I need to do is:
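#include <unistd.h>

void
read_stdin()
{
    char buffer[MAX_BUFFER+1];
    int n;
    /* read() returns the byte count, 0 at end of file, -1 on error */
    while ((n = read(0, buffer, MAX_BUFFER)) > 0) {
        buffer[n] = '\0';    /* terminate so process_buffer sees a string */
        process_buffer(buffer);
    }
}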
How about this:
(defun read-stdin ()
  (loop
    (multiple-value-bind (buffer start end)
        (send *standard-input* :read-input-buffer)
      (if buffer (process-buffer buffer start end)
          (return)))
    (send *standard-input* :advance-input-buffer)))
Sorry for the C code on slug, but that's it.
No :element-type :ascii-string-char-letter-or-number-but-not-list.
No (send stream :read-xy-or-z-iop).
Just a simple read statement. And, if it's IO from a file, I just do a fopen
and fclose. Using this I get a few meg/minute IO rate.
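For the file case, a minimal sketch of the fopen/fclose variant might look like the following. It uses stdio's fread rather than the raw read() call shown earlier; the names read_file and chunk_fn and the 8192-byte buffer size are assumptions for illustration, not taken from the original post.

```c
#include <stdio.h>
#include <stddef.h>

#define MAX_BUFFER 8192   /* assumed chunk size; the post does not give one */

/* Hypothetical callback type standing in for process_buffer. */
typedef void (*chunk_fn)(const char *data, size_t len);

/* Read the file at `path` in MAX_BUFFER-sized chunks.
   Returns the total number of bytes read, or -1 if the file
   cannot be opened. */
long read_file(const char *path, chunk_fn process)
{
    char buffer[MAX_BUFFER];
    size_t n;
    long total = 0;
    FILE *fp = fopen(path, "rb");

    if (fp == NULL)
        return -1;
    while ((n = fread(buffer, 1, sizeof buffer, fp)) > 0) {
        if (process != NULL)
            process(buffer, n);   /* hand each chunk to the caller */
        total += (long)n;
    }
    fclose(fp);
    return total;
}
```

Passing the byte count alongside the buffer avoids having to null-terminate each chunk.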
Well, you can use OPEN and CLOSE, or WITH-OPEN-FILE:
(with-open-file (*standard-input* your-file :element-type '(unsigned-byte 8))
  (read-stdin))
This gives me about 2.5 megabytes/minute over an NFS link, not particularly good compared to UNIX.
Sometimes I output statistics in long ints (> 200000). Sometimes I output
these stats in floats. Sometimes I output strings along with these numbers.
What I can't afford to do is spend a few hours trying to find the most
efficient sequence of functions to output each different type of problem.
And I especially can't afford to output these files in a non-readable (by eye)
format, since I can't tell if my theory works unless I can read my output.
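On the C side, the kind of mixed, eye-readable output described here is typically a single printf-family call. A rough sketch, where the function name format_stat and the field layout (label, count, score) are assumptions made up for illustration:

```c
#include <stdio.h>

/* Format one human-readable statistics line into `out`.
   The layout (string label, long count, float score) is assumed
   for illustration; it is not from the original post.
   Returns the number of characters that the full line needs,
   as reported by snprintf. */
int format_stat(char *out, size_t size,
                const char *label, long count, double score)
{
    return snprintf(out, size, "%s %ld %.4f\n", label, count, score);
}
```

For example, format_stat(line, sizeof line, "parses", 250000L, 0.8125) fills line with "parses 250000 0.8125" followed by a newline.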
You can give me hacky solutions till the cows come home. I appreciate the
effort, and I can even do these things myself. I just can't indulge myself
in such pursuits. I need IO to be a three line problem that I solve without
even a thought. Reading and writing data needs to be as simple as reading.
Well, READ-STDIN is a ~3 line program. However, from the way you describe it,
PROCESS-BUFFER must be reasonably complicated because it has to parse your data somehow.
In C I presume you have to use some variant of scanf. In LISP you can use READ,
which we know can be improved.
All I want from Symbolics is a standard reading function that lets me give
it a pre-allocated buffer and which reads in data from disk in a time
on the same order of magnitude as a Sun. I realize disk speeds vary. But
disk speeds do not explain orders of magnitude slowdowns.
There are many improvements possible here:
o In C, functions like scanf manipulate integers (bytes) directly. In
  LISP, READ manipulates CHARACTERs, so there is an extra translation
  step from bytes to characters which is unnecessary. Also, when we
  PARSE-INTEGER, as I presume READ does, we convert from CHARACTER
  back into BYTE to extract the digit.
o There is also translation from ASCII byte codes to LISPM byte-codes.
o The stream code is carefully layered with flavors and mixins.
  However, the mixins tend to get in the way of performance.
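To make the first point concrete, here is a rough C sketch of pulling a decimal integer straight out of a byte buffer, with no intermediate character objects. The function name parse_int_bytes is hypothetical; the start/end arguments mirror the (buffer start end) values in READ-STDIN above. It assumes ASCII input and does no overflow checking.

```c
#include <stddef.h>

/* Parse a decimal integer from buf[start..end), working on raw bytes.
   Stores the result in *value and returns the index just past the
   last digit consumed; returns `start` (and leaves *value untouched)
   if no digits were found.
   Assumes ASCII input and does not check for overflow. */
size_t parse_int_bytes(const unsigned char *buf, size_t start, size_t end,
                       long *value)
{
    size_t i = start;
    size_t first_digit;
    long result = 0;
    int negative = 0;

    /* optional sign */
    if (i < end && (buf[i] == '-' || buf[i] == '+')) {
        negative = (buf[i] == '-');
        i++;
    }
    first_digit = i;
    /* each digit is obtained by plain byte arithmetic */
    while (i < end && buf[i] >= '0' && buf[i] <= '9') {
        result = result * 10 + (buf[i] - '0');
        i++;
    }
    if (i == first_digit)
        return start;           /* no digits: consume nothing */
    *value = negative ? -result : result;
    return i;
}
```

The digit value comes from a single subtraction on the byte, which is the step that the CHARACTER round trip adds work to.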
If C is too low level, perhaps LISP is too high. However, we should
be able to use LISP to improve things. For example:
o Each stream should have its own :READ method optimized for it.
o There should be some way to optimize out the unnecessary method
  lookup and funcalls in code we want to be fast.
I realize I have been long-winded, but I wanted to be as clear as possible.
I hate to say it, but I *used* to love my Lispm. Now, I just wish they would
fix my Sun so it can run lisp better. Hopefully, the UX400 will be the
answer. I am anxious to see Genera 8.0, but you will forgive me if I am
a little skeptical that much will change.
Many share your concern.