Re: LispM's crummy I/O performance (was RE: LispM Market

Received: from BLAISE.kahuna.decnet.lockheed.com by ALAN.kahuna.decnet.lockheed.com via CHAOS with CHAOS-MAIL id 17386; 19 Jan 90 16:42:08 PST
Date: Fri, 19 Jan 90 16:42 PST
From: Montgomery Kosma <kosma@ALAN.kahuna.decnet.lockheed.com>
Subject: Re: LispM's crummy I/O performance (was RE: LispM Market
To: slug@ALAN.kahuna.decnet.lockheed.com
In-Reply-To: The message of 19 Jan 90 06:46 PST from sobeck@RUSSIAN.SPA.Symbolics.COM
Message-ID: <19900120004203.4.KOSMA@BLAISE.kahuna.decnet.lockheed.com>

    Date: Fri, 19-Jan-90 06:46:49-PST
    Date: Wed, 17 Jan 90 15:06 PST
    From: sobeck@RUSSIAN.SPA.Symbolics.COM

    To: "alanr%media-lab.media.mit.edu"%ELEPHANT-BUTTE.SCRC.Symbolics.COM@Warbucks.AI.SRI.COM,
    Cc: "slug%Warbucks.AI.SRI.COM"%ELEPHANT-BUTTE.SCRC.Symbolics.COM@Warbucks.AI.SRI.COM
    In-Reply-To: <9001170035.AA04103@media-lab>
    Message-Id: <19900117230620.0.SOBECK@BRAIN-DAMAGE.SPA.Symbolics.COM>

	Date: Tue, 16 Jan 90 19:35:21 EST
	From: alanr@media-lab.media.mit.edu

	      Now, anyone who has touched UNIX knows that reading in a 5meg file and
	      even allocating memory for 1 million tokens (which are represented as
	      integers) takes on the order of a few minutes (at most).  The LispM
	      took almost over 5 hours to read in the data set!  This is on a 3650
	      under 7.2.  What's more, the same LispM took almost 48 hours to dump
	      the data (via NFS to a disk on a SUN4)!
	      The ironic part of this is the CM took only about 15 seconds to
	      actually process the data.

	      I solved the same problem using C without the connection machine.  In
	      fact, on a HP Series 800, the serial solution takes only a few hours,
	      and it would take under an hour if not for paging problems at runtime.

	      I think that 1meg/hr is not an acceptable I/O rate.

	      -- David Magerman (magerman@linc.cis.upenn.edu)
	      University of Pennsylvania LINC Laboratory

	Given this horror story, I suppose that I should chip in with an io story
	that is somewhat more reasonable.

	I also use the connection machine, and needed to get access to ~ 1megabyte tiff
	image files from a networked server. These needed to be converted to
	symbolics rasters (so I could display them) and then downloaded to the cm.

	While my initial implementation was slow (order of several minutes), I was
	able to improve it substantially, to the point that I was getting a
	throughput of around 50k bytes per second over ethernet, comparable to ftp
	between the unix hosts.

    If you are willing to work with local FEP files, you can achieve performance of
    about 250K Bytes/sec (reading or writing), not counting the time required to construct
          ^^^^^^^^^^^^^^  !!!
    the data structures.

This is **exactly** the kind of benchmark I've been talking about!!  My
Amiga (total system cost of about $3000) does disk I/O that peaks out
higher than that!! (with a stupid Seagate drive, even).  I've seen
Micropolis drives peak out at over 600 KBytes/sec.  And that's
reading/writing TEXT (HUMAN READABLE) FILES!!!

When I'm dealing with large amounts of numerical data, the last thing I
want to do is to use some funky binary format.  Typically I get geometry
files off of a UNIX system or an IBM mainframe, process them on VMS to
get volume descriptions, then load them into the Symbolics and crunch on
the Connection Machine.  The only way to do file interchange between
different pieces of code on different systems is to use ASCII files,
which **SHOULD** run at least one or two hundred K/sec.  I think this is
totally reasonable and that the Symbolics I/O times are incredibly poor.
I couldn't believe somebody (in another slug message) talking about 40
minutes to read a 5 MB file like it was acceptable!!!!  Pure garbage!!
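
To put the figures quoted in this thread on one scale, here is a rough
back-of-the-envelope sketch.  It assumes 1 MB = 1024 KBytes (the posts do
not say which convention they use) and rounds "almost over 5 hours" to 5
hours; the function names are made up for illustration.

```python
# Rough throughput arithmetic for the numbers quoted in this thread.
# Assumption: 1 MB = 1024 KBytes; file sizes and times are as quoted.

def kbytes_per_sec(megabytes, seconds):
    """Average transfer rate in KBytes/sec for a file of the given size."""
    return megabytes * 1024 / seconds

def seconds_to_transfer(megabytes, rate_kbytes_per_sec):
    """Time in seconds to move a file at a given rate in KBytes/sec."""
    return megabytes * 1024 / rate_kbytes_per_sec

# 5 MB read in roughly 5 hours (the 3650 horror story)
rate_five_hours = kbytes_per_sec(5, 5 * 3600)

# 5 MB read in 40 minutes (the other SLUG message)
rate_forty_minutes = kbytes_per_sec(5, 40 * 60)

# The same 5 MB at the quoted FEP rate and at the quoted Micropolis peak
time_at_250 = seconds_to_transfer(5, 250)
time_at_600 = seconds_to_transfer(5, 600)

print(f"5 MB in 5 hours : {rate_five_hours:.2f} KBytes/sec")
print(f"5 MB in 40 min  : {rate_forty_minutes:.2f} KBytes/sec")
print(f"5 MB at 250 K/s : {time_at_250:.1f} sec")
print(f"5 MB at 600 K/s : {time_at_600:.1f} sec")
```

Under those assumptions, the 5-hour read works out to about 0.28
KBytes/sec and the 40-minute read to about 2.1 KBytes/sec, while 250
KBytes/sec would move the same file in roughly 20 seconds.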

monty kosma