
Host uptimes



Snail Mail Address: American Microsystems Inc. (A wholly owned
		    subsidiary of Gould Inc.) CAD Research Lab. P.O.
		    Box 967 Twain Harte, CA  95383
Phone Number:       (209)586-7422

    Date: Tue, 7 Feb 89 20:04 EST
    From: Qobi@ZERMATT.LCS.MIT.EDU (Jeffrey Mark Siskind)

	Date: Tue, 7 Feb 89 14:36 PST
	From: Spock@SAMSON.CADR.DIALNET.SYMBOLICS.COM (Mr. Spock)

	    Date: Mon, 6 Feb 89 17:49 PST
	    From: TYSON@Warbucks.AI.SRI.COM (Mabry Tyson)

	    A while back there was some discussion about host uptimes.  One host here
	    has just passed the 1/2 year mark of being up.  (It is in an individual's
	    office and he does use it daily but he doesn't program.  Mainly he uses
	    ZMAIL, TELNET, and Image-Calc.)  Another host is up to the 5 month mark.
	    Our main file server/syshost was up 15 weeks until it crashed today.  Four
	    other hosts have been up for 2 months or more.  (We have about 31 hosts.)

	    On the negative side, we have one machine that has been down for about
	    3 weeks with a bad disk.  (Yes, we are on full maintenance and they are
	    working on it, but still...)

	We have a 3600 here in Twain Harte that (except for power outages)
	stayed up for somewhere in the neighborhood of 3 years (RSK, Correct me
	if I'm wrong).  Of course the track record has gone down since the
	installation of the IFU board set about 2 years ago but still it's a
	very reliable machine.  If it weren't for the power going out in the
	winter around here we'd probably have some pretty amazing uptimes.

    I think he was referring to time between machine crashes and not time between
    hardware breakages.

Ahhh.  Well, in that case I can't lay claim to any sort of uptime
longevity.  I'm one of those people that likes to boot a lot.  I hate
ghosts.  Of course there is the school of thought that might suggest
that I'm doing something wrong.  But, nobody's perfect.

    Funny how these messages are coming over the network just now. For a while now I have
    been looking forward to sending a message telling everybody that my machine is up
    11 weeks 6 days 4 hours 19 minutes 17 seconds. I have been waiting
    for the machine to crash so I can get a maximum time but my machine just refuses
    to crash. But I guess that others have outdone me by far. On the other hand,
    I do a lot of compute stuff like run SPIRE, and my own Prolog compiler --- not
    just telnet and mail. Sometimes I have to hold its hand during GC by doing
    several rounds of GC-by-area to keep it going. Right now my GC thermometer shows
    that about 95% of memory is used. I have been here before and have successfully
    recovered about 30% of memory by doing several judicious rounds of GC-by-area.
    Michael Greenwald has unofficially told me about a hack called "slow GC" which
    can do this somewhat automatically. Could somebody from Symbolics fill me in
    on the details of whether this exists in 7.2 and if not whether it will exist
    in a future release?

    Specmanship aside, I think Symbolics deserves a round of applause for improving
    both the hardware and software reliability of their products. I remember the
    days of release 4.5 when my machine would crash several times a day. I still
    have nightmares about "Page fault on unallocated VMA" or "Unrecoverable disk
    overrun" or "Lisp stopped itself". Once upon a time you couldn't run a job
    unattended overnight with more than a 10% chance of it finishing. Now I
    regularly run heavy compute bound tasks which take a whole weekend. Keep
    up the good work!
	    Jeff

I agree.