I’ve written about this topic a couple of times before. But it seems another post is in order, since my main box is now running Windows 7 and I’d like to share how SMART monitoring has changed in that version.
I keep all my “data” on a separate physical hard drive so that in theory it’s easy to swap out the OS drive or upgrade it without too much fear of losing information. This also makes it easy to migrate the data to a bigger drive when I inevitably run out of space. Some time ago I did just that, switching to a 1 TB drive from a 500 GB model.
It had been running well for about a year (I think) when I looked at the SMART data and saw that the reallocated sector count was dangerously close to the threshold value. I didn’t think I needed to worry about it quite yet, so I left it alone.
But a few days ago I got a call while at work: “The computer is saying something about a hard drive going bad, and I don’t know what to do.” Oh boy. So I dropped everything after finishing up a meeting and bought a replacement drive (1.5 TB this time), ready to deal with the worst.
As it turned out, things weren’t in too bad a shape. Windows 7 itself seems to do a much better job of monitoring SMART status than Windows XP did:
Just for kicks, I started eventvwr.msc and looked at the System events. Sure enough, there were a couple of entries mentioning disk issues:
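The same entries can be pulled without the GUI via wevtutil. As a sketch, here’s a little Python that assembles the query, assuming the entries come from the “disk” event source (event source names vary by driver, so treat that as an assumption):

```python
# Sketch: build a wevtutil command that lists recent System events from the
# "disk" source -- the same entries eventvwr.msc shows under System.
# This only assembles the command; on Windows you'd pass it to subprocess.run.

def disk_event_query(count=20):
    """Return a wevtutil command line for the most recent 'disk' events."""
    return [
        "wevtutil", "qe", "System",
        "/q:*[System[Provider[@Name='disk']]]",  # XPath filter on the event source
        "/f:text",       # human-readable output instead of raw XML
        f"/c:{count}",   # limit to the most recent events
    ]

cmd = disk_event_query()
print(" ".join(cmd))
```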
For more detail, I looked at SpeedFan’s output:
Yup, the reallocated sector count had hit its failure threshold.
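SpeedFan shows both the normalized value and the vendor-defined threshold for each attribute; an attribute counts as failed once its value decays down to or below the threshold. A minimal sketch of that check (the numbers here are made up for illustration):

```python
def smart_attribute_failed(value: int, threshold: int) -> bool:
    """SMART normalized values start high (often 100) and decay toward a
    vendor-defined threshold; the attribute has failed once value <= threshold."""
    return value <= threshold

# Hypothetical readings for attribute 05 (Reallocated Sector Count):
healthy = smart_attribute_failed(value=100, threshold=36)  # False: still above threshold
failing = smart_attribute_failed(value=36, threshold=36)   # True: value has hit the limit
print(healthy, failing)
```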
Since Windows 7 ships with the excellent Robocopy command line tool, I decided to use it to copy everything from the failing drive to the new one. I thought I could use a USB-to-SATA adapter I had lying around the house, but after some time I concluded the adapter must be flaky. From time to time the drive connected to it would act funny in Windows Explorer: folders wouldn’t refresh when asked to, and sometimes folders that were supposed to be on the drive didn’t show up at all. The copy process also seemed fishy when watched with Process Explorer:
Looking at I/O Bytes History, I saw big initial spikes of traffic, then a weird drop-off, followed by a long period of inactivity between each file copy operation. Too bad I don’t remember when I bought that USB-to-SATA device, because it definitely needs to go back for a full refund or replacement.
Anyway, I ended up hooking the new drive up to a free internal SATA port inside the computer, and from then on the copy went quite smoothly. The I/O traffic pattern looked much more evenly distributed.
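For the record, the Robocopy invocation was along these lines. The drive letters and log path are placeholders, not my actual paths, and the short retry settings are what I’d pick for a source drive with bad sectors so the copy doesn’t stall forever on an unreadable file. The sketch just assembles the command:

```python
# Sketch: assemble a Robocopy command line for cloning one drive's contents
# to another. Paths are hypothetical; on Windows you'd run it with
# subprocess.run(cmd) from an elevated prompt (/COPYALL needs admin rights).

def build_robocopy_cmd(src, dst, log="copy.log"):
    """Return a Robocopy command line for a full recursive copy."""
    return [
        "robocopy", src, dst,
        "/E",        # copy subdirectories, including empty ones
        "/COPYALL",  # copy data, attributes, timestamps, ACLs, owner, auditing
        "/R:1",      # retry a failing file only once...
        "/W:1",      # ...and wait just 1 second between retries
        f"/LOG:{log}",
    ]

cmd = build_robocopy_cmd("D:\\", "E:\\")
print(" ".join(cmd))
```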
After finishing the copy, I checked the drive manufacturer’s website to see if the drive was still under warranty. Their online system couldn’t tell me for some strange reason, even though I typed in the model and serial number correctly, along with the failure code their test tool spits out. A quick call to their warranty department confirmed that it was still covered, and so I initiated the exchange for a fresh drive.
The final step in this was to erase the content of the old drive, just to be sure nobody could get to it, should the manufacturer’s promises of destroying the drive not come to pass. For that I used Eraser, a free tool that has many, many options for overwriting the entire drive with random data patterns, making it pretty much impossible to recover anything.
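Eraser operates on whole drives or partitions, but the core idea is just repeated overwriting with random data. Here’s a toy version of that idea applied to a single file — a sketch only, and obviously not something to point at data you still need:

```python
import os

def overwrite_file(path: str, passes: int = 3) -> None:
    """Overwrite a file in place with random bytes over several passes,
    then delete it -- a toy model of what Eraser does to an entire drive."""
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for _ in range(passes):
            f.seek(0)
            f.write(os.urandom(size))  # fresh random pattern each pass
            f.flush()
            os.fsync(f.fileno())       # push the pass to disk, not just the cache
    os.remove(path)                    # finally drop the (now scrambled) file

# Example: scramble and delete a scratch file.
with open("scratch.bin", "wb") as f:
    f.write(b"secret data" * 100)
overwrite_file("scratch.bin")
print(os.path.exists("scratch.bin"))  # False
```

Note that on modern drives and filesystems, overwriting a file in place doesn’t guarantee the old blocks are gone, which is why whole-drive tools like Eraser exist in the first place.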
It’s always a bother when hard drives go bad, but sometimes it is possible to avert complete disaster. Windows 7 is much more proactive than its predecessor XP in terms of early detection of problems, which gives you time to move important data off a drive before it dies.
Of course, should a head crash happen, this would be useless, so I also have an alternate backup. No longer at Mozy, though. With my amount of data and their recent pricing changes, I decided to go with CrashPlan, partly because of their option to ship me a drive for the initial backup so I don’t have to wait for months to upload everything. The other part is that they still offer unlimited storage. And I need that.