File System Performance Loss in Genera 8.0 (16X slower!)



    Date: Mon, 9 Jul 90 11:46 EDT
    From: ESC@RIVERSIDE.SCRC.Symbolics.COM (Eric S. Crawley)

    The second problem stems from the older hardware in
    the 36XX series.  The ethernet controller on the OBS machines (3600,
    3640, 3645, 3670, 3675) was designed before there were ethernet
    controller chips on the market.  It was not designed for the throughputs
    of today's servers and drops packets.  There isn't much we can do about
    this.  While the NBS machines (3610, 3620, 3630, 3650, etc.) have an
    ethernet controller chip (the LANC chip), they only have 2 (I think)
    hardware buffers dedicated to the chip and the microcode has to copy the
    data out of the buffers before they can receive more data.  So, NBS
    machines can drop packets when they are sent too fast.  No surprise,
    most modern Suns can send packets this fast.  I first noticed this
    problem when measuring network performance of an XL1200 blasting to a
    3650.  The XL1200 can cause this problem for an NBS machine too.  The
    real danger here is that NFS sends the exact same message at the same
    speed every time and if you have just enough data to get retransmitted to
    the machine every time, without any other data to disrupt the stream,
    you can lose.  I believe this is what they are seeing at MCC.  Again,
    there isn't much we can do about this, these machines were not designed
    to have "network firehoses" opened up on them.  

One thing you could do is change the default transfer size for 36xx
clients.  It's probably safe to assume that most of the systems to which
they talk via NFS are fast enough that the large datagrams will be lost.
The main reason for requesting large blocks is to reduce the overhead of
processing requests by the server; however, a 36xx can't send requests
frequently enough to bother a Sun server.

						    Nobody was pushing data
    around that fast at the time.  Note, none of these problems affect Ivory
    based machines, as far as I know.

In my case, my 3640 is able to receive the 8K datagrams fine, but a 3650
gets lots of reassembly nodes.  And the server I've been testing with is
a Sun-3/280, not exactly a "modern Sun".  When I try it with a 4/280
server the 3640 sometimes misses one datagram per 10K file.

I think there may also be a problem in Sun's NFS server, which causes it
to fail to respond to Symbolics retry packets.  I had a network monitor
running once while a 7.2 Lispm was having trouble accessing files on a
Sun.  The Lispm was having I/O problems, causing it to miss incoming
packets frequently.  But what I noticed on the monitor was that when the
Lispm sent a repeat request for a block from the file, the Sun didn't
respond at all.

In my previous message I suggested lowering
NFS:*LOCAL-NETWORK-TRANSFER-SIZE*.  Another important variable to set is
NFS:*LOCAL-SUBNET-TRANSFER-SIZE*.
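
As a rough illustration of why a smaller transfer size helps, the sketch
below counts how many IP fragments one NFS read reply turns into on
Ethernet.  It assumes NFS over UDP, a 1500-byte Ethernet MTU, and a
20-byte IP header with no options; the function name and the
rpc_overhead parameter are hypothetical, and the RPC/NFS reply header is
ignored by default.  Losing any one fragment loses the whole datagram,
so a receiver with only a couple of hardware buffers fares much better
when each reply fits in one or two back-to-back packets.

```python
import math

ETHERNET_MTU = 1500  # bytes of IP packet carried per Ethernet frame
IP_HEADER = 20       # IPv4 header, assuming no options
UDP_HEADER = 8


def fragments_per_reply(transfer_size, rpc_overhead=0):
    """Count the IP fragments needed for one NFS-over-UDP read reply.

    rpc_overhead is a hypothetical allowance for the RPC/NFS reply
    header; the default of 0 slightly underestimates the datagram size.
    """
    # Fragment payloads must be a multiple of 8 bytes (1480 here).
    payload_per_fragment = (ETHERNET_MTU - IP_HEADER) // 8 * 8
    udp_datagram = UDP_HEADER + rpc_overhead + transfer_size
    return math.ceil(udp_datagram / payload_per_fragment)


if __name__ == "__main__":
    for size in (8192, 4096, 2048, 1024):
        print(size, fragments_per_reply(size))
```

Under these assumptions an 8K transfer arrives as 6 back-to-back
fragments, while a 1K transfer fits in a single packet.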

                                                barmar