
Re: LispM's crummy I/O performance (was RE: LispM Market Share)



  Posted-Date: Tue, 16 Jan 90 16:32:58 -0500
  To: slug@warbucks.ai.sri.com
  Subject: LispM's crummy I/O performance (was RE: LispM Market Share)
  Date: Tue, 16 Jan 90 16:32:58 -0500
  From: magerman@linc.cis.upenn.edu
  
  
  I have a quantitative response to how slow LispM I/O is and how much of
  an obstacle it can be.  I started working on a connection machine with
  a LispM front end this summer.  I wrote a simple program to perform a
  much needed task: to recognize and count all sequences of length <= n
  in a stream of tokens.  My data set included a million tokens, which
  was contained in a 5meg file.
  
  Now, anyone who has touched UNIX knows that reading in a 5meg file and
  even allocating memory for 1 million tokens (which are represented as
  integers) takes on the order of a few minutes (at most).  The LispM

Can you be more specific about this task?  I generated a file of almost
5 megabytes containing

1000 2000 3000 4000 5000 6000 7000 8000 9000 0000
1000 2000 3000 4000 5000 6000 7000 8000 9000 0000
...

which matches the average size of your integers (tokens).  Do I now load
this into a 1 million element array?  On a Sun 3/50 this file takes
under a minute to manipulate with, say, wc, cat, and grep '111'.

  took almost over 5 hours to read in the data set!  This is on a 3650
  under 7.2.

What do you mean?  Something like:

(with-open-file (stream file)
  (let ((results (read stream)))
    (do-unto results)))
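
Or, if the file is a flat run of integers rather than a single form,
something closer to the sketch below, which reads one token at a time
into a preallocated array.  The name READ-TOKENS and the default COUNT
of 1000000 are assumptions made for illustration, not anything taken
from the original program, and the input is assumed to hold nothing but
whitespace-separated fixnums.

```lisp
;; Hypothetical helper: READ-TOKENS and COUNT are illustrative names.
;; Assumes the file holds only whitespace-separated fixnums.
(defun read-tokens (pathname &optional (count 1000000))
  "Read up to COUNT integers from PATHNAME into a fill-pointer vector."
  (let ((tokens (make-array count :element-type 'fixnum :fill-pointer 0)))
    (with-open-file (stream pathname :direction :input)
      (loop for token = (read stream nil :eof)
            ;; Stop at end of file or once the array is full.
            until (or (eq token :eof)
                      (= (fill-pointer tokens) count))
            do (check-type token fixnum)
               (vector-push token tokens)))
    tokens))
```

Timing a call such as (time (read-tokens "tokens.data")), where the
file name is again a placeholder, would show whether the reader or the
file system dominates.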

              What's more, the same  LispM took almost 48 hours to dump
  the data (via NFS to a disk on a SUN4)!

Do you mean with dump-forms-to-file?  If so, I'm not surprised.  A lot
of time is spent growing (copying) the hash table that maintains eqness.
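
The general effect can be seen with an ordinary EQ hash table.  The
sketch below is only an illustration of growth versus presizing using
the standard :size argument to make-hash-table; FILL-EQ-TABLE is a
hypothetical name, and nothing here claims that the table used inside
dump-forms-to-file can be presized by the caller.

```lisp
;; Hypothetical helper for comparing a grown table with a presized one.
(defun fill-eq-table (n &key presize)
  "Insert N fresh conses into an EQ hash table and return the table."
  (let ((table (if presize
                   (make-hash-table :test 'eq :size n)
                   (make-hash-table :test 'eq))))
    (dotimes (i n table)
      ;; Each (list i) is a fresh object, so every key is distinct under EQ.
      (setf (gethash (list i) table) i))))

;; Compare, for example:
;;   (time (fill-eq-table 1000000))
;;   (time (fill-eq-table 1000000 :presize t))
```

How large the difference is depends on the implementation's rehash
policy, so the two timings are worth measuring rather than assuming.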

  The ironic part of this is the CM took only about 15 seconds to
  actually process the data.
  
  I solved the same problem using C without the connection machine.  In
  fact, on a HP Series 800, the serial solution takes only a few hours,
  and it would take under an hour if not for paging problems at runtime.
  
  I think that 1meg/hr is not an acceptable I/O rate.
  
Perhaps your application would make a good benchmark.
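
As a starting point for such a benchmark, the counting step itself
might look like the sketch below.  It assumes that "sequences of length
<= n" means contiguous runs of tokens and that the tokens are already
in a vector; COUNT-SEQUENCES is a hypothetical name, and this is a
plain serial version, not the Connection Machine program described
above.

```lisp
;; Hypothetical serial sketch of the counting task.
(defun count-sequences (tokens n)
  "Count every contiguous run of 1..N tokens in the vector TOKENS.
Returns an EQUAL hash table mapping each run (a list) to its count."
  (let ((counts (make-hash-table :test 'equal))
        (len (length tokens)))
    (dotimes (start len counts)
      ;; Runs starting near the end of TOKENS are shorter than N.
      (loop for size from 1 to (min n (- len start))
            do (incf (gethash (coerce (subseq tokens start (+ start size))
                                      'list)
                              counts 0))))))
```

For example, (count-sequences #(1 2 1 2) 2) records the runs (1), (2),
(1 2) and (2 1).  Timing this separately from the read and the dump
would show how much of the total is really I/O.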

k