
Re: LispM IO hysteria



  Posted-Date: Sun, 28 Jan 90 03:35:35 -0500
  To: SLUG@warbucks.ai.sri.com
  Subject: LispM IO hysteria
  Date: Sun, 28 Jan 90 03:35:35 -0500
  From: magerman@linc.cis.upenn.edu
  
  
  So far, Lispms are slightly in the lead.  But then we hit the IO problem.
  I can only waste so much time on hacking IO.  Now, don't get me wrong.  I'm
  a hacker by nature and I love an interesting problem as much as the next guy.
  But, to me, developing an algorithm to parse natural language sentences in real
  time (which is what I do) is *far* more interesting than trying to figure out
  how to read in a file in real time.
  
  The biggest complaint I hear from most lisp people about C is that you end up
  programming at such a low level that it becomes tedious.  But, for reading in
  an ascii stream from stdin, all I need to do is:
  
  void
  read_stdin()
  {
    char buffer[MAX_BUFFER+1];
  
    while (read(0,buffer,MAX_BUFFER) > 0)
      process_buffer(buffer);
  }
  
How about this:

(defun read-stdin ()
  (loop
    (multiple-value-bind (buffer start end)
	(send *standard-input* :read-input-buffer)
      (if buffer (process-buffer buffer start end)
	  (return)))
    (send *standard-input* :advance-input-buffer)))

  Sorry for the C code on slug, but that's it.
  No :element-type :ascii-string-char-letter-or-number-but-not-list.
  No (send stream :read-xy-or-z-iop).
  Just a simple read statement.  And, if it's IO from a file, I just do a fopen
  and fclose.  Using this I get a few meg/minute IO rate.

Well, you can use OPEN and CLOSE, or WITH-OPEN-FILE:

(with-open-file (*standard-input* your-file :element-type '(unsigned-byte 8))
  (read-stdin))

This gives me about 2.5 megabytes/minute over an NFS link, which is not
particularly good compared to UNIX.
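
For comparison, here is a minimal sketch of the fopen/fclose variant described
in the quoted message.  It is a guess at what such code looks like, not the
original: read_file is a hypothetical name, and this process_buffer only counts
bytes and takes an explicit length, which the quoted version does not.

```c
#include <stdio.h>
#include <stddef.h>

#define MAX_BUFFER 8192   /* assumed size; the quoted message does not give one */

/* Hypothetical consumer: it only counts the bytes it is handed. */
static size_t bytes_seen = 0;

static void process_buffer(const char *buffer, size_t length)
{
    (void)buffer;
    bytes_seen += length;
}

/* Read the whole file at `path` through one pre-allocated buffer.
   Returns 0 on success, -1 if the file cannot be opened or a read fails. */
int read_file(const char *path)
{
    char buffer[MAX_BUFFER];
    size_t n;
    int status = 0;
    FILE *fp = fopen(path, "rb");

    if (fp == NULL)
        return -1;

    /* fread returns the number of bytes actually read; 0 means EOF or error. */
    while ((n = fread(buffer, 1, sizeof buffer, fp)) > 0)
        process_buffer(buffer, n);

    if (ferror(fp))
        status = -1;
    fclose(fp);
    return status;
}
```

Passing the length along plays the same role as the START and END values that
:READ-INPUT-BUFFER returns in the LISP version.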
  
  Sometimes I output statistics in long ints (> 200000).  Sometimes I output
  these stats in floats.  Sometimes I output strings along with these numbers.
  What I can't afford to do is spend a few hours trying to find the most
  efficient sequence of functions to output each different type of problem.
  And I especially can't afford to output these files in a non-readable (by eye)
  format, since I can't tell if my theory works unless I can read my output.
  
  You can give me hacky solutions till the cows come home.  I appreciate the
  effort, and I can even do these things myself.  I just can't indulge myself
  in such pursuits.  I need IO to be a three line problem that I solve without
  even a thought.  Reading and writing data needs to be as simple as reading.

Well, READ-STDIN is a ~3-line program.  However, from the way you describe it,
PROCESS-BUFFER must be reasonably complicated, because it has to parse your data somehow.
In C I presume you have to use some variant of scanf.  In LISP you can use READ,
which we know can be improved.
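
To make the scanf guess concrete, here is a sketch of parsing one line of
mixed output (a string, a long int and a float).  The record layout and the
names stat_record and parse_record are my own assumptions for illustration;
the quoted message does not say what its data looks like.

```c
#include <stdio.h>

/* Hypothetical record: one label, one long count, one floating-point score. */
struct stat_record {
    char   label[32];
    long   count;
    double score;
};

/* Parse a line such as "np-attach 212345 0.5".
   Returns 1 if all three fields were read, 0 otherwise. */
int parse_record(const char *line, struct stat_record *out)
{
    /* %31s bounds the label so it cannot overflow out->label. */
    return sscanf(line, "%31s %ld %lf",
                  out->label, &out->count, &out->score) == 3;
}
```

The format string does all the type dispatch here, which is roughly the job
READ does in LISP.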
  
  All I want from Symbolics is a standard reading function that lets me give
  it a pre-allocated buffer and which reads in data from disk in a time
  on the same order of magnitude as on a Sun.  I realize disk speeds vary.  But
  disk speeds do not explain orders of magnitude slowdowns.

There are many improvements possible here:

o In C, functions like scanf manipulate integers (bytes) directly.  In
  LISP, READ manipulates CHARACTERs, so there is an extra translation
  step from bytes to characters which is unnecessary.  Also, when we
  PARSE-INTEGER, as I presume READ does, we convert from CHARACTER
  back into BYTE to extract the digit.

o There is also translation from ASCII byte codes to LISPM byte-codes.

o The stream code is carefully layered with flavors and mixins.
  However the mixins tend to get in the way of performance.  
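
As an illustration of the first point, this is roughly what "manipulating
bytes directly" amounts to in C.  It is a sketch, not what scanf actually does
internally: parse_decimal is a hypothetical helper, it assumes ASCII input, and
it does not check for overflow.

```c
#include <stddef.h>

/* Parse an unsigned decimal integer from buf[start..end).
   Stores the value in *value and returns the index one past the last digit.
   Returns `start` and leaves *value untouched if no digit is present.
   Assumptions: ASCII digits; overflow is not checked in this sketch. */
size_t parse_decimal(const unsigned char *buf, size_t start, size_t end,
                     unsigned long *value)
{
    unsigned long v = 0;
    size_t i = start;

    /* Each digit is extracted by integer arithmetic on the raw byte;
       no separate character object is ever created. */
    while (i < end && buf[i] >= '0' && buf[i] <= '9') {
        v = v * 10 + (unsigned long)(buf[i] - '0');
        i++;
    }
    if (i > start)
        *value = v;
    return i;
}
```

The (buf, start, end) arguments are chosen to match what PROCESS-BUFFER
receives in READ-STDIN, so the same approach could in principle be applied to
an (unsigned-byte 8) buffer on the LISP side.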

If C is too low level, perhaps LISP is too high.  However, we should
be able to use LISP to improve things.  For example:

o Each stream should have its own :READ method optimized for it.
  
o There should be some way to optimize out the unnecessary method
  lookup and funcalls in code we want to be fast.

  I realize I have been long-winded, but I wanted to be as clear as possible.
  I hate to say it, but I *used* to love my Lispm.  Now, I just wish they would
  fix my Sun so it can run lisp better.  Hopefully, the UX400 will be the
  answer.  I am anxious to see Genera 8.0, but you will forgive me if I am
  a little skeptical that much will change.
  
Many share your concern.