[Tfug] Drive recovery

John Gruenenfelder johng at as.arizona.edu
Wed Jun 3 07:41:09 MST 2009


I've recently had a drive at work begin the throes of death.  While SMART did
not warn me about any problems beforehand, it did tell me later that the drive
had run out of spare sectors.  Some random sector went bad, but the drive no
longer has any spare sectors left in its pool to remap the bad one.

This means that any attempt to read data from that sector results in a
low-level read error from the drive.

To make matters worse, whichever sector (or sectors) died must have held
important XFS filesystem metadata, because I can no longer mount the
filesystem at all.  Bah!

Now, it seems like the immediate problem is that as soon as any program gets a
low-level read error, it promptly aborts.  If I could force mount, xfs_check,
xfs_repair, or some other program to keep trying, maybe it could work?  I know
ext2/3 keeps superblock backups for emergencies.  I'm hoping that whatever XFS
data got clobbered is residing elsewhere on the drive.  But since every
program aborts right away, they never get a chance to try fallback methods (if
any exist).
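
To illustrate what I mean by "keep trying", here is a rough Python sketch that
walks a device block by block and records which blocks fail instead of
aborting on the first error.  The function names and the 4096-byte block size
are my own assumptions for illustration; this is not how mount or xfs_repair
actually behave, and the kernel may spend a long time retrying each bad read.

```python
import os


def find_unreadable_blocks(read_at, total_size, block_size=4096):
    """Return the byte offsets of blocks whose read raised OSError.

    read_at(offset, length) is any callable that returns bytes or raises
    OSError; the scan carries on after a failure rather than aborting.
    """
    bad = []
    for offset in range(0, total_size, block_size):
        length = min(block_size, total_size - offset)
        try:
            read_at(offset, length)
        except OSError:
            # Note the failure and move on to the next block.
            bad.append(offset)
    return bad


def scan_device(path, block_size=4096):
    """Scan a file or block device (hypothetical path) for unreadable blocks."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # Seeking to the end gives the size for regular files and,
        # on Linux, for block devices as well.
        total_size = os.lseek(fd, 0, os.SEEK_END)
        return find_unreadable_blocks(
            lambda offset, length: os.pread(fd, length, offset),
            total_size,
            block_size,
        )
    finally:
        os.close(fd)
```

Run as root against the partition's device node, scan_device() should return
the list of offsets that could not be read, which would at least tell me how
many bad spots there are and where they sit.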

This particular partition was a very large data partition.  The rest of the
system is backed up frequently, but the data drives are too large.  Therefore,
it is the users' responsibility to back up important data on their own or
direct the machine's backup system as to what is important.  But... very few
users ever bother to do this, so I need to try my best to get any data off the
drive.

So, any ideas on how I might mount it?  That would be easiest.  Alternatively,
I'll need some sort of lower-level tool to scan the drive for lost
contents... perhaps something that scans for known file types?  I'm not
really sure what's available.  I did find recoverdm in the Debian repository,
but it looks to be focused on retrying bad sectors on CD/DVD media and less so
on hard drives.  Another possible solution would be to copy the whole
partition into an image file and have tools work on it that way since they
would no longer have to deal with read errors.  Unfortunately, the partition
which failed was the largest one and there is currently insufficient space in
the machine to hold a copy of that partition.
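
By "scans for known file types" I mean something along the lines of the
Python sketch below, which looks for a few well-known magic numbers in a raw
byte stream and reports where they occur.  It needs no extra disk space since
it only reads.  The signature table is deliberately tiny and the function name
is made up; a real recovery tool would know far more formats and would also
have to work out where each file ends.

```python
# Illustrative magic numbers only; real carving tools know many more.
SIGNATURES = {
    "jpeg": b"\xff\xd8\xff",
    "png": b"\x89PNG\r\n\x1a\n",
    "pdf": b"%PDF-",
}


def scan_for_signatures(stream, chunk_size=1 << 20):
    """Yield (offset, kind) for each signature found in a binary stream.

    Hits are not guaranteed to come out in offset order.
    """
    # Keep enough bytes from the previous chunk to catch a signature
    # that straddles a chunk boundary.
    overlap = max(len(sig) for sig in SIGNATURES.values()) - 1
    tail = b""
    base = 0  # absolute offset of buf[0] within the stream
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buf = tail + chunk
        for kind, sig in SIGNATURES.items():
            start = 0
            while True:
                idx = buf.find(sig, start)
                if idx == -1:
                    break
                # Matches lying entirely inside the carried-over tail
                # were already reported on the previous pass.
                if idx + len(sig) > len(tail):
                    yield base + idx, kind
                start = idx + 1
        new_tail = buf[-overlap:]
        base += len(buf) - len(new_tail)
        tail = new_tail
```

The stream could be the partition's device node opened in binary mode, though
as written a read error would still stop the scan, so it would have to be
combined with some way of skipping the unreadable blocks.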

I'm open to any suggestions on how to get some data off this partition.
Presumably, just one bad sector can't be that hard to work around, right?  :(


-- 
--John Gruenenfelder    Systems Manager, MKS Imaging Technology, LLC.
Try Weasel Reader for PalmOS  --  http://weaselreader.org
"This is the most fun I've had without being drenched in the blood
of my enemies!"
        --Sam of Sam & Max



