wired-wait-timeout on 3600s
- To: TLP@MIT-OZ
- Subject: wired-wait-timeout on 3600s
- From: David A. Moon <Moon@SCRC-TENEX>
- Date: Fri, 5 Aug 83 16:33:00 EDT
- Cc: bug-3600-software@MIT-OZ, bug-3600-hardware@MIT-OZ, hoss@SCRC-VIXEN, info-lispm@MIT-OZ
- In-reply-to: The message of 5 Aug 83 15:53-EDT from TLP at MIT-AI
Date: Fri, 5 Aug 1983 15:53 EDT
From: TLP@MIT-OZ
One of our 3600's (ROBOT1) keeps dropping dead (about once a day) with
the following message from FEP:
FEP> Machine halted "Wired-wait timeout, predicate ~S false for over
~D ms. ~@
<211><211> Proceed to attempt recovery <DTP-6 20067> 72460(8)
FEP>
Diagnosis:
There are a large number of problems that all manifest themselves
with this or a similar error message. There are several software
and microcode bugs; these are fixed in Release 5. There are also
at least two hardware bugs for which ECOs have been written. These
problems are all being worked on pretty vigorously.
Getting it fixed:
It will probably take some time for the fixes to the sources of this
problem to get to you. As far as I know we have not decided to try to
retrofit software fixes for this into Release 4; Release 5 should get to
you some time in the fall. You could contact field service to check
on when they can install the hardware fixes; the phone number is
576-2524 I think (you may want to check with Tom Callahan first).
Recovery:
Normally you could type continue (at the FEP> prompt) and the
machine would recover and continue the disk operation that timed
out. That's what the "proceed to attempt recovery" is trying to
say. Unfortunately this does not work, because a bug in the
FEP software bashes the state of the machine so that it cannot
be continued.
In all cases that I have seen you can successfully recover from
a wired-wait timeout by warm booting; type start <return> at
the "FEP>" prompt.
Reports:
If you feel like reporting this, please get more information, such
as exactly what the machine was doing at the time. You will probably
find that it is impossible to get sufficiently precise information
about what was going on, in which case we have to assume you are
encountering one (or more) of the known causes of this symptom,
not a new one that we don't know about yet.