1TB My Book Live suddenly stopped working

Ok, this has me completely stumped. I have two 1TB My Book Live drives that have been working without any sort of problems. They are old, and if what I’m reading on the bottom are date codes, then one is from the 28th week of 2009 and the other is from the 18th week of 2010.

Both of them have suddenly stopped responding on the network sometime between Tuesday and Wednesday. No errors, no warnings, nothing. I was still using both of them during the day on Tuesday, and there was no indication of errors. And even so, to have both of them die seemingly at the same time? I have other NAS units on the network that are working just fine. One of them is another My Book Live and the other is a My Cloud. No issues. That can’t be coincidence. So, I started diagnosing:

  • First thing I did was shut down everything, including the internet modem, the internal router, and the managed switch that the drives are plugged into. When everything came back up, those two drives were nowhere to be found on the network.

  • Then I logged into my DHCP router to watch traffic while I restarted one of the drives. I see its MAC show up, it gets an IP, and for a brief moment I can both ping the drive as well as pull up the admin page. But within a minute it freezes and it just disappears from the network.

  • Next, I held the back button for approximately 5 seconds to reset the network details. I also deleted the static map from my router in case something’s gone weird. After restarting the drive, I watched it get a new IP, again I was able to ping it and pull up the admin page. But like before, it died within a minute.

  • Since I already have a secondary backup of the data, I went ahead and did a complete factory reset by holding the button in for approximately 30 seconds. Reboot took a little bit longer but it eventually came up again, but no such luck. The drive briefly appears on the network then just drops.

The only observation I have of the physical unit is the white lights on the front, and when the unit disappears from the network, those lights stop moving. That makes me wonder if the unit is freezing up somehow.

  • At this point I removed one of the units from the network and connected it directly to my laptop. With the WD Link utility I was able to find its IP and pull up the admin interface. I was also able to mount the data volume and see everything on it. For testing purposes, I went ahead and copied all of the data to the laptop’s local storage. Again, no issues whatsoever. The copy process took over an hour, and all the while the admin interface was working; it never froze. I kept clicking on different links every so often.

  • I connected the drive back to the network, thinking maybe it was just a fluke but alas, no dice. It comes up briefly, then completely freezes.
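For anyone who wants to pin down exactly when a drive drops, the ping-and-admin-page check in the steps above can be automated. The following is only a rough sketch: it assumes the admin page answers on TCP port 80, and the address 192.168.1.50 in the usage comment is a made-up placeholder for whatever IP the router hands out. The function names are mine, not anything from WD.

```python
import socket
import time
from datetime import datetime


def is_reachable(host, port=80, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def watch(host, port=80, interval=5.0, checks=60):
    """Poll host:port and return a list of (timestamp, is_up) state changes."""
    transitions = []
    last = None
    for _ in range(checks):
        state = is_reachable(host, port)
        if state != last:
            stamp = datetime.now()
            transitions.append((stamp, state))
            print(f"{stamp:%H:%M:%S} {'UP' if state else 'DOWN'}")
            last = state
        time.sleep(interval)
    return transitions


# Example (hypothetical address, assumed port):
# watch("192.168.1.50", port=80, interval=5.0, checks=60)
```

Start it right after powering the drive on, and the printed UP and DOWN timestamps show how long the unit stays reachable before it freezes. Keep in mind that the polling is itself network traffic, so it is not a completely passive observer.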

Now, as I mentioned, I have other NAS units on the network and they are all working without any issues. These two are the only two that aren’t working. All the data on them is intact, the log file isn’t showing any problems with them. Their firmware is up to date and on one of them I even generated a new SSL certificate, just in case whatever came with the drive had expired and was now causing problems. Nothing.

So I’m now scratching my head.

  1. Why, or how, would two drives, same vintage, same model, stop working at about the same time? Does the software on them have some sort of a kill-switch?
  2. What else could be wrong with them that is preventing them from staying connected to the network, but yet work flawlessly when they are directly connected to a computer?
  3. What the heck do I do?

Beyond a rare instance of an internal error leading to your units being rejected by your router, I do not believe there’s a problem with them. This is confirmed by them working when connected directly to your system and taking the router out of the equation.

Perhaps it would be best to contact WD Support about this. You can do so at the following link.

http://support.wdc.com/support/case.aspx?lang=en

Be sure to explain the issue in detail, including the troubleshooting steps taken so far. They may request additional information or logs if needed.

Would that be an option though, for units that are this old and certainly out of warranty?

And, thinking about it some more, if there was an issue with the router itself, I would expect it to kick all of them out, or at least all of the same type, but it only happened with two of the five that are connected (the other three being 2 WDs and 1 of a different brand). I’m just baffled.

This seems pretty far-fetched, but I wonder if your router’s firewall (if it has one) is seeing something from these 2 NAS drives that it doesn’t like. Maybe it sees a minute’s worth of traffic as a DoS attack and effectively removes the drives from your network. If so, the router’s log should show something.

Another (unlikely) possibility would be some duplication of static and dynamic (or multiple static) IP addresses in your network.
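If the router lets you export its static map and DHCP lease table, a quick way to rule this out is to look for any IP claimed by more than one MAC. This is just a sketch: the `(mac, ip)` pair format and the sample entries below are invented for illustration, so you would have to adapt it to whatever your router actually exports.

```python
from collections import defaultdict


def find_ip_conflicts(leases):
    """Return {ip: sorted list of MACs} for every IP claimed by more than one MAC."""
    macs_by_ip = defaultdict(set)
    for mac, ip in leases:
        # Normalise case so the same MAC written two ways is not a false conflict.
        macs_by_ip[ip].add(mac.lower())
    return {ip: sorted(macs) for ip, macs in macs_by_ip.items() if len(macs) > 1}


# Made-up entries combining static mappings and dynamic leases.
sample_leases = [
    ("00:90:A9:00:00:01", "192.168.1.50"),
    ("00:90:a9:00:00:02", "192.168.1.51"),
    ("00:11:22:33:44:55", "192.168.1.50"),  # clashes with the first entry
]

conflicts = find_ip_conflicts(sample_leases)
print(conflicts)
```

An empty result means no address is claimed twice in the list you fed it; it says nothing about devices that have an IP hard-coded on the device itself and never appear in the router's tables.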

Your problem does sound seriously strange.

There’s nothing coming from the NAS drives, as far as the router is concerned. I do monitor its traffic regularly and on internal traffic nothing is getting blocked. And the IPs are all fine. Nothing has changed and the setup has been running fine for years. And with yesterday’s complete systems shutdown and restart, if there was an IP conflict, I would’ve seen it on the DHCP server’s log files. Which is what makes this so weird. I’ve gone through everything I could possibly think of, short of replacing the drives themselves and going through the (unsanctioned) process of getting the firmware on the new drives and what not. I’m going to leave that as the very last straw possibility, but not before I talk to WD, if it is an option given the age of these.

I’m about to fire up a sandbox unix server with a DHCP server on it and connect the drives to that and watch what happens. If they still fail, then I know it would be something with the drives, and not the network. I’m running out of ideas though …

This sounds really similar to the drive I’m messing with. The lights come on, but there’s nothing coming from the ethernet port, though the lights say it’s connected and the activity light occasionally blinks. I’ve tried several cables and connected it to several routers, but there’s nothing from the drive.

The hard drive seems to be working fine, but I can’t connect to it.

EDIT:
I’m even getting the brief network appearance when you first turn it on, then it disappears.

Yep, same scenario here: they show up briefly, then disappear. When I tried an isolated DHCP setup, the same thing happened, so I know it’s not my network that’s doing it. It’s something within the drive’s firmware, whatever it may be. I also discovered that my original claim that a direct computer-to-drive connection worked was a bit premature. That too fails and the drive just disappears. Weird thing is, if there’s an operation happening, such as when I was copying all the data off of them, they remain connected just fine. But as soon as that operation finishes, they drop their connection. The lights on the ethernet port are still on, and blinking as if there’s activity, but that’s mainly the network itself sending out broadcast packets. There’s nothing coming or going from the drives.

The only two things I can think of here are either:
a) something in the firmware expired (like an SSL certificate, although I regenerated a new one with no luck), or
b) there’s a physical issue.

Now the first one is possible; however, after having generated a new certificate from the admin interface, I would’ve expected that to resolve it. It didn’t. So it’s something else.

For the second option, I’m thinking perhaps a component like an electrolytic cap has failed (those are fairly easy to replace). While plausible, for two drives to fail at the same time like that? Both drives are the same vintage, purchased about 4 months apart, but still, what are the chances that the same component fails on both, AT THE SAME TIME? Both drives failed on the same night, while the others are still running fine.

I say there’s still something fishy going on and I can’t figure out what it might be. Annoying to say the least.

Well, I figured out that in my case the hard drive had failed, which then caused the My Book Live to not work because it wasn’t able to load its OS off of the hard drive. I’m halfway tempted to find an old drive and test it just to see if the rest of the device is still good.

Why old drive? You should be able to put a new drive in. That was one of the things I tried as well, in case the drive was indeed going bad. Swapped it out for a brand new 3TB, put the image back on, booted it up, same deal. Shows up on the network briefly then disappears. Same scenario in a sandbox DHCP setup, which would eliminate the main network blocking them.

Well that’s unfortunate. I say old drive because I don’t want to waste money on a new one if the NAS is still dead.

Nah, it wouldn’t be wasted. I already removed the drive and stuck it in a computer on the network.

OK. Friday I had a customer that uses the 1TB My Book for backup.
Same thing happened … no network connection anymore.
When rebooted, it shows up for a short while, then disconnects.
As it doesn’t have a USB connection, it’s hard to check what the problem is.
I figured it was broken.

Today I am with a different customer … same thing happened with the same mybook.

What’s going on? How can we fix this?