Working around a flaky LMFS disk

Our main LMFS file server has developed a problem with one of its disks,
and Symbolics Customer Support has so far been unable to provide me with
any suggestions better than a full reload from backups.  I figured that
before I undertook this drastic measure, I'd see if anyone else has
dealt with this kind of problem before.

Configuration: Symbolics 3650 with five external CDC-515 disks
containing LMFS partitions, totalling about 2.3 gigabytes.

Symptom: any attempt to access unit 2, any cylinder, surface 17 results
in a %DISK-ERROR-SEARCH error.

Symbolics disk addresses are allocated in cylinder-major order, which
means that 24 adjacent addresses out of every 576 are affected.  The bad
spots are therefore scattered throughout the LMFS partition on that
disk, including the LMFS free list.
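
To make the arithmetic concrete, here is a rough Python sketch of which
disk addresses land on the bad surface.  It takes only the 576 and 24
figures from above; everything else is an assumption: that surfaces are
numbered from zero, that each surface's 24 blocks are contiguous within
a cylinder, and that addresses are counted from the start of the unit.
The function names are made up for illustration and are not part of any
Symbolics software.

```python
BLOCKS_PER_CYLINDER = 576   # figure quoted above
BLOCKS_PER_SURFACE = 24     # figure quoted above
BAD_SURFACE = 17            # assumed to be a zero-based surface number


def surface_of(address):
    """Surface a unit-relative disk address falls on (assumed layout)."""
    return (address % BLOCKS_PER_CYLINDER) // BLOCKS_PER_SURFACE


def is_affected(address, bad_surface=BAD_SURFACE):
    """True if the address lies on the bad surface."""
    return surface_of(address) == bad_surface


def affected_ranges(first, last, bad_surface=BAD_SURFACE):
    """Yield inclusive (start, end) runs of bad addresses within [first, last]."""
    first_cyl = first // BLOCKS_PER_CYLINDER
    last_cyl = last // BLOCKS_PER_CYLINDER
    for cyl in range(first_cyl, last_cyl + 1):
        start = cyl * BLOCKS_PER_CYLINDER + bad_surface * BLOCKS_PER_SURFACE
        end = start + BLOCKS_PER_SURFACE - 1
        # Clip the run to the requested address window.
        start, end = max(start, first), min(end, last)
        if start <= end:
            yield (start, end)
```

Under those assumptions, the first two cylinders (addresses 0 through
1151) contain the bad runs 408-431 and 984-1007, and the pattern repeats
every 576 addresses across the whole partition.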

LMFS is not very good at dealing with hard disk errors in its
partitions.  The only tool available is LMFS:FIX-FILE, and it is only
prepared to deal with simple problems like an ECC error.  It drops into
the debugger when handed one of the affected files.

As I understand it, it is not possible to use SI:FIX-FEP-FILE on an LMFS
partition if you want LMFS to be able to use it afterward.  LMFS
maintains linear offsets into the partition files, so splicing out a bad
block would cause many of these offsets to be incorrect.
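
A toy Python model of the offset problem, purely for illustration: the
block list, the file names and the offset table below are invented and
do not reflect the real LMFS or FEP on-disk formats.  It only shows why
removing one block from the middle of a linearly addressed partition
invalidates every stored offset that points past the removed block.

```python
def splice_out(blocks, bad_index):
    """Return a copy of the partition with one block removed."""
    return blocks[:bad_index] + blocks[bad_index + 1:]


# Hypothetical partition contents and per-file linear offsets.
blocks = ["hdr", "a0", "a1", "BAD", "b0", "b1"]
offsets = {"a": 1, "b": 4}   # file name -> offset of its first block

spliced = splice_out(blocks, 3)

# Offsets that no longer point at the block they used to point at.
stale = {name: off for name, off in offsets.items()
         if spliced[off] != blocks[off]}
```

Here file "a" lies before the removed block and is unaffected, while the
stored offset for file "b" now points one block too far.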

Has anyone got any suggestions before I reinitialize the entire LMFS and
start a reload?