Drive replacement strategy in a RAID 5 NAS

So yes, a RAID 5 configuration can survive the failure of any one drive. That's all well and good, as long as the failed drive is replaced and the array rebuilt before another failure.

Mechanical disks wear out, and if you start with disks that are all of the same vintage, a failure in one suggests the others have an increased chance of failing too. That happened to me a couple of years ago: the second failure came before the RAID could be rebuilt. Everything was lost.

What's considered best practice for replacing drives in a RAID 5 (or 10) NAS? My thinking is that every (expected lifetime of a drive) / (# of drives in the RAID), you pull the oldest drive and replace it with a new one. (And of course always have a spare on hand in case of actual failure.)
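For concreteness, here's a back-of-the-envelope sketch of that interval. The 5-year lifetime and the 4-bay count are just assumed numbers for illustration, not measured figures:

```python
# Staggered-replacement interval: (expected drive lifetime) / (number of drives).
# Both inputs below are assumptions for illustration, not measurements.
expected_lifetime_years = 5     # assumed service life of one drive
drives_in_array = 4             # e.g. a four-bay EX4100

interval_months = expected_lifetime_years * 12 / drives_in_array
print(f"Replace the oldest drive every {interval_months:.0f} months")
# -> Replace the oldest drive every 15 months
```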

But there doesn't seem to be an explicit way to cleanly tell an EX4100 (or other model), "I'm going to replace drive N." I presume you could shut the NAS down, replace a disk, and restart, and the NAS would figure out what to do, but WD doesn't seem to acknowledge that use case.

Thoughts, anyone? (If this has been covered in the forums, I can’t find it.)

Before the WD bot tells you to open a support case (which won't be answered, and which will educate no one on the forum):

I suspect the answer is "replace it, and the unit will figure it out."
That would be the case if WD had really thought through the process.
You may need to simply pull the old drive FIRST (so the unit flags it as missing) before you go and insert the new drive.
Theoretically, you could even do all this with the unit running, but I would NOT try that.

Most would be scared to do that. I wouldn’t do it :slight_smile: (This is why I don’t skydive: I am not the type to jump out of a perfectly good airplane)


Regarding proactive drive replacement:

  • Remember that common-mode failures can compromise all the drives and all the data in one shot (virus, fire, rogue firmware update, user error, lightning). You ALWAYS want a backup of ALL data on a NAS. Do that first.

  • Hard drives, like most electronics, follow a bathtub curve: most failures happen in the first 90 days, very few happen in the intervening years, then a lot fail at about the same time at the end. What is the lifetime of a drive? WHO KNOWS. It depends on how you use it.

  • My current method:
    Main files: on a 1 TB SSD.
    Primary backup (lesser-used files): on a NAS.
    Secondary backup (at home): on an external USB drive. This drive is only ever 1-3 years old; every few years I retire it, store it as an archive (protection against virus/user error), and start with a fresh USB drive.
    Disaster recovery (not at home): an external USB drive kept OFF SITE, updated every year or two; protects against fire, theft, and natural disasters.

  • Replacement of NAS drives: I am a data junkie; I seem to buy NAS units every once in a while, so I don't wear them out. I also don't leave them running 24/7/365. I don't need them every day, so I tend to run them for a few days, then shut them down for a few weeks.

If I were inclined to replace drives regularly, I would probably replace one drive every 1 to 2 years.

So taking the bull by the horns…

After making sure everything I cared about had been backed up, I shut down my NAS and pulled one of the drives. The NAS was still configured as RAID 5 + spare. On power-up, as expected, the NAS went ahead and reconstructed the RAID using the previously idle drive. Everything worked.
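For anyone who wants to watch a rebuild like that from the command line: this is only a rough sketch, assuming SSH access is enabled and that the unit uses the standard Linux md RAID stack underneath (the 60-second poll interval is arbitrary).

```python
# Sketch: poll /proc/mdstat until no rebuild/resync is reported.
# Assumes shell access to the NAS and a standard Linux md RAID stack.
import time

def mdstat() -> str:
    with open("/proc/mdstat") as f:
        return f.read()

while True:
    status = mdstat()
    print(status)
    # During a rebuild, mdstat shows a "recovery" (or "resync") progress line.
    if "recovery" not in status and "resync" not in status:
        break              # no rebuild in progress, or it has finished
    time.sleep(60)         # check again in a minute
```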

So then I shut down again and plugged in a "new" drive. Actually, it was an old drive, the same manufacturing date as the one I pulled but never used. (My plan, once the experiment was done, was to reconfigure the whole thing as RAID 5 and make sure I had a spare on hand in case of failure.) Funny thing: the NAS declared the "new" drive to be corrupt and refused to use it. I had to put the "old" drive back. Humph.

As an aside, I also asked WD a related question: what if I started replacing my 2 TB drives with 4 TB drives over time? I knew, of course, that half of each new drive would be wasted, but once all four were replaced I ought to be able to use their procedure (normally used to replace each drive in turn, with a rebuild in between) to use the now fully expanded capacity. I finally got to talk to a tech, who responded that switching to larger drives was always a data-destructive process. "But there are instructions on your site about how to do that expansion while preserving data." He went off for about ten minutes, then came back and offered to show me the page that said I was right.
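For reference, here is roughly what that expansion boils down to on a generic Linux mdadm box. This is a sketch of the underlying idea, not WD's documented procedure; the array name /dev/md1 and the ext4 filesystem are assumptions, and on an EX4100 you would normally let the unit's own UI handle all of this.

```python
# Sketch only: generic Linux mdadm, NOT the EX4100's documented procedure.
# Assumptions: the RAID 5 array is /dev/md1, it carries an ext4 filesystem,
# and every 2 TB member has already been replaced by a 4 TB drive, one at a
# time, with a full rebuild completing between swaps. Run as root.
import subprocess

def run(cmd):
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

run(["mdadm", "--grow", "/dev/md1", "--size=max"])  # use the full capacity of each member
run(["resize2fs", "/dev/md1"])                      # grow the filesystem to fill the larger array
```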

Bottom line: yes, you can replace a drive proactively. Whether that's a good strategy is arguable, I suppose. You might be able to just hot-swap the drives, but that would be a bad idea.

And I’ll agree with NAS_user that you can never be sure that the whole system won’t go belly up, so yes, it’s important to have redundant backup for stuff you really care about.

I'm currently in that very situation. I have four 2 TB drives in RAID 5. One drive is failing, and I'm trying to find directions on how to upgrade the drives to 4 TB. Any help would be appreciated.