Fatal Disk Error urgency [1]
Date: Wed, 6 Jun 90 13:33 EDT
From: Dodds@YUKON.SCRC.Symbolics.COM (Douglas Dodds)
Date: Wed, 6 Jun 90 00:25 EDT
From: RWK@FUJI.ILA.Dialnet.Symbolics.COM (Robert W. Kerns)
Hi Bob.
[ . . . . ]
I don't know if SI:FIX-FEP-BLOCK performs the proper actions to
minimize damage to the file in which the block appears, but hopefully
the documentation will enlighten you. In general, world-load files which
get a bad block should be replaced, LMFS files should have a block of zeros
substituted,
As the other Doug says below, never use these functions on an LMFS
partition unless you intend to throw the entire LMFS away.
and paging files can just have the block removed (while the
file is not in use!)
Ditto world files: Never use these functions on files that are in use.
The 8.0/7.4/7.2 documentation for SI:FIX-FEP-BLOCK and SI:FIX-FEP-FILE is
sketchy at best. I distributed a more comprehensive guide to the field
service people around Jan 89.
These functions will do a read-only scan of a block or fep file for ECC
errors. If an error is found, the user will be queried about performing a
write/read test. The write/read tests will rewrite the block using
repaired (but probably incorrect) data from the block and reread the
block to see if the error still exists. If there is still an error in the
block, then the user will be queried with several choices:
1. DELETE: This option is supposed to remove the bad block from the file,
add it to the bad blocks list, and then delete the file. It should be
used on world load files when you know the data in the bad blocks has
been trashed. The file should be expunged from the disk file system
and reloaded from tape (or recreated).
2. SPLICE: This option is supposed to remove the bad block from the file
and add it to the bad blocks list. The file's data map is spliced back
together, leaving an intact working file. This option can be used to
repair trashed paging files.
3. ZERO: This option is supposed to remove the bad block from the file,
allocate a new block from the free map, splice that block into the
file, and write 0's into it. This option can also be used to repair
trashed paging files.
4. COPY: This option is supposed to remove the bad block from the file
while retaining the original data in the block (which is probably
damaged in some way), allocate a new block from the free map, splice
that block into the file, and write the original data back into it.
This option could be used if you feel the original data has not been
trashed. Good Luck.
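
To make the differences between the four options concrete, here is a toy model in Python. It is only a sketch of the bookkeeping described above, not the FEP implementation: the ToyDisk class, the function names, the block numbers, and the choice to return a deleted file's remaining blocks to the free map are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDisk:
    """Hypothetical stand-in for a disk: contents, free map, bad-blocks list."""
    data: dict                       # block number -> block contents
    free: list                       # block numbers available for allocation
    bad: list = field(default_factory=list)


def splice(disk, data_map, bad_block):
    """SPLICE: retire the bad block and close up the file's data map."""
    disk.bad.append(bad_block)
    return [b for b in data_map if b != bad_block]


def _substitute(disk, data_map, bad_block, contents):
    """Shared step for ZERO and COPY: put a fresh block where the bad one was."""
    new_block = disk.free.pop(0)     # allocate from the free map
    disk.data[new_block] = contents
    disk.bad.append(bad_block)
    return [new_block if b == bad_block else b for b in data_map]


def zero(disk, data_map, bad_block, block_size=4):
    """ZERO: the replacement block is filled with zeros."""
    return _substitute(disk, data_map, bad_block, bytes(block_size))


def copy(disk, data_map, bad_block):
    """COPY: the replacement block gets the (possibly damaged) original data."""
    return _substitute(disk, data_map, bad_block, disk.data[bad_block])


def delete(disk, data_map, bad_block):
    """DELETE: retire the bad block, then delete the file itself.

    Returning the remaining blocks to the free map is an assumption of
    this toy model.
    """
    remaining = splice(disk, data_map, bad_block)
    disk.free.extend(remaining)
    return []


# Small usage example with made-up block numbers.
disk = ToyDisk(data={10: b"aaaa", 11: b"b?bb", 12: b"cccc"}, free=[20, 21])
print(zero(disk, [10, 11, 12], 11), disk.bad, disk.free)
```

In this model SPLICE shortens the file by one block, while ZERO and COPY keep its length and differ only in what ends up in the replacement block. In every case the bad block lands on the bad-blocks list exactly once.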
Due to minor bugginess in past releases, I usually recommend that if a
hard ECC error is found, one should use the SPLICE option, delete and
expunge the file, and recreate it. This ensures that the bad block is
indeed removed from use and is not allocated to both the original file and
the bad blocks file.
In all cases, run the function SI:VERIFY-FEP-FILESYSTEM to make sure
everything is clean.
The documentation states, and I agree, that for safety, you should never
use SI:FIX-FEP-BLOCK or SI:FIX-FEP-FILE on LMFS partitions. Instead,
use LMFS:FIX-FILE, which gives you the right pathname-based handles on
the file, and limits the options to those that are safe for the
integrity of LMFS partitions.
To gain more information on errors and their locations, these functions
can be safely run on any disk file (including LMFS partitions) AS LONG AS
NO POSITIVE ACTION IS TAKEN. Say NO to all option queries.