After reading numerous posts about recovering bricked MyCloud devices (special thanks to Fox_eye), and every post related to solid white light (slightly orangish I always thought), and then thinking I’d have to resort to TTL UART Access, I managed to find a solution to my problem. Perhaps this solution applies to your device. I’ve tried to include enough relevant terms so anyone searching for a related problem in the future might find this post.
Mine is a Gen 1 (with the Debian-flavor OS), not the Gen 2 (with the Busybox-flavor OS).
Background:
I managed to lock myself out of my otherwise normally working MyCloud Gen1. By this I mean I could no longer SSH into my device, courtesy of a series of configurations with unintended consequences on my part. The Dashboard web UI was still accessible and SSH could be enabled/disabled. That didn’t help my situation, since I had edited /etc/ssh/sshd_config to disable password auth and require public-key auth. The factory sshd_config is NOT restored during a 40-second reset.
For the really curious: I had set up some bind mountpoints (in /etc/fstab) to appear as dirs under /home. However, those dirs actually live on /DataVolume so the contents survive firmware upgrades. I subsequently learned (after a reboot) that /DataVolume itself isn’t mounted until some point after /etc/fstab is processed during normal initialization. Ergo, /etc/fstab could NOT bind-mount my relocated /home dirs. Ergo, no authorized_keys file for ssh to find in a user dir. Ergo, locked out.
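For illustration only, this is the shape of fstab entry involved (the path names here are hypothetical, not my actual shares):

```
# /etc/fstab -- hypothetical example of the kind of bind entry I had added.
# At boot this fails, because /DataVolume is not yet mounted at the point
# where fstab is processed:
/DataVolume/custom/alice  /home/alice  none  bind  0  0
```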
Problem:
At this point, the device still had its normal blue light. I powered down the device through the web UI and disassembled it. I removed the HDD and put it into my SATA docking station attached to my Linux workstation (Fedora 31, but that probably doesn’t really matter). The drive showed up, and I could easily mount the raid partitions. I edited the proper etc/ssh/sshd_config to remove my restrictions, which should allow me to ssh into the device normally.
I put it all back together and… solid white light (not blinking). I mention not blinking because blinking white light is a well-known published indicator for “device initializing”. There are no normal HDD noises. The network port on the device doesn’t indicate a link, though the link activity light flickers from time to time. The network port it’s connected to at the switch occasionally thinks there’s a link.
Furthermore, exactly every 42 seconds, the white light blinks off (for less than a second). This repeats indefinitely. For hours. I left it overnight and nothing about this changed. I read oh so many anecdotes about this happening to people. None were specific about having previously removed the drive from the enclosure and attaching it to another system like I had done, but perhaps they had.
What Happened
I pieced together what must have happened after reading lots and lots of posts, and some particularly excellent posts/replies from Fox_eye, like the one at the bottom of this post.
In one post, a contributor mentions that when attaching the HDD to a different host system (like my Linux workstation, though it could be a “live” boot CD/USB Linux), the raid partitions appear as /dev/md_127 instead of /dev/md0. That was the case for me, but I hadn’t thought anything of it.
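In hindsight, there is a quick way to see which mdX name an array wants: a 0.90-format md superblock records a “Preferred Minor” on the disk itself. On the real drive you would run mdadm --examine as root against a member partition (sdX is a placeholder); the sketch below just parses a typical, abridged 0.90 report (sample text, not captured from my device) to pull the field out.

```shell
#!/bin/sh
# On the real disk you would run (as root): mdadm --examine /dev/sdX1
# Here we parse a typical, abridged 0.90-superblock report to extract the
# "Preferred Minor" field, which controls the mdX name at assembly time.
sample='        Version : 0.90.00
Preferred Minor : 127
          State : clean'
minor=$(printf '%s\n' "$sample" | awk -F' : ' '/Preferred Minor/ {print $2}')
echo "$minor"
```

A preferred minor of 127 here would mean the array assembles as md127, not the md0 the device’s bootargs expect.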
I did start thinking about the boot process and trying to capture a log of it.
I got curious about the contents of the other partitions on the HDD. Again using Fox_eye’s information, I learned that partitions 5/6 hold identical copies of the kernel (it’s the same thing that appears in the file /boot/uImage). Partitions 7/8 are nearly identical to each other as well (they ultimately come from the files /usr/local/share/k1m0.env and k1m1.env respectively). The file /boot/boot.env also contains very similar information:
```
bootargs="console=ttyS0,115200n8, init=/sbin/init"
bootargs="$bootargs root=/dev/md0 raid=autodetect"
bootargs="$bootargs rootfstype=ext3 rw noinitrd debug initcall_debug swapaccount=1 panic=3"
bootargs="$bootargs mac_addr=$eth0.ethaddr"
bootargs="$bootargs model=$model serial=$serial board_test=$board_test btn_status=$btn_status"
bootm /dev/mem.uImage
```
One of my two WD MyCloud Gen 1 machines looks for /dev/md0. The other one looks for /dev/md1.
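To make the consequence concrete, here is a shell re-enactment of the bootargs concatenation from boot.env (the U-Boot-variable lines like $model are omitted, since they only make sense inside U-Boot). The point is where root=/dev/md0 lands on the kernel command line: if the array on the disk now prefers a different md number, the kernel never finds its root filesystem.

```shell
#!/bin/sh
# Re-enactment of the bootargs build-up in boot.env, shell-compatible
# lines only. Shows that the kernel is told its root is /dev/md0.
bootargs="console=ttyS0,115200n8, init=/sbin/init"
bootargs="$bootargs root=/dev/md0 raid=autodetect"
bootargs="$bootargs rootfstype=ext3 rw noinitrd debug initcall_debug swapaccount=1 panic=3"
echo "$bootargs"
```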
Apparently, attaching it to my Linux workstation trampled the device ID the raid is known by. I’m an experienced Linux user, but I know next to nothing about raid setup. I had been under the impression that all device IDs are assigned by the system the device is attached to: attach it to a different system, and the device gets a different device ID. Apparently not for raids, whose 0.90-format superblocks store a “preferred minor” (the X in mdX) right on the disk. Or perhaps it was an artifact of connecting it to my workstation via the SATA-USB docking station?
The Fix
I reattached the HDD to my workstation and used the information appearing on this page from Fox_eye’s Website.
```
mdadm --stop /dev/md_127
mdadm --zero-superblock --force /dev/sdX1
mdadm --zero-superblock --force /dev/sdX2
sync
mdadm --create /dev/md0 --level=1 --metadata=0.9 --raid-devices=2 /dev/sdX1 /dev/sdX2
```
- mdadm is the command-line tool for managing raids
- in sdX, replace X with the actual letter your system assigns to the HDD
- This recreates the raid as /dev/md0
I was already many trials-and-fails into device de-bricking, so recreating the raid was just another trial. However, it might have been possible to do something less intensive, like the following, as a first attempt.
```
mdadm --stop /dev/md_127
mdadm -A /dev/md0 /dev/sdX1 /dev/sdX2
```
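If I were retrying a less intensive route, mdadm’s --update=super-minor assemble option looks like the candidate: for 0.90 metadata it rewrites the stored preferred minor to match the md device being assembled, so the array would answer to md0 again without being recreated. I have not verified this on the MyCloud itself, so treat it as a sketch. The block below only prints the commands rather than executing them, since the real thing needs root and the actual disk (sdX is a placeholder).

```shell
#!/bin/sh
# Dry-run sketch: print the commands instead of executing them.
# --update=super-minor rewrites the 0.90 superblock's preferred minor to
# match the md device being assembled, i.e. the array becomes md0 again.
run() { echo "$@"; }   # change `echo "$@"` to `"$@"` to run for real
run mdadm --stop /dev/md_127
run mdadm --assemble /dev/md0 --update=super-minor /dev/sdX1 /dev/sdX2
```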
At this point I believe I also “flashed” the new firmware onto the raid and the other partitions, again using instructions from Fox_eye’s Website.
- Basically, download the latest firmware directly from WD
- Unzip the download, then unpack the *.deb archive
- Then using dd…
- copy rootfs.img into /dev/md0
- copy uImage (the kernel) into /dev/sdX5 and sdX6 (same source for both destinations)
- copy k1m0.env into /dev/sdX7 and k1m1.env into /dev/sdX8
- Where to find these things in the *.deb archive and how to perform these steps is outlined in a number of posts, and again, Fox_eye’s Website.
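The flashing steps above can be sketched as the following dry-run script. It only prints each command rather than running it, because the real commands need root, the unpacked firmware files, and the actual disk (sdX is again a placeholder for your drive letter); getting an of= target wrong with dd is destructive, so double-check before converting this to a live run.

```shell
#!/bin/sh
# Dry-run of the flashing steps: echo each dd command instead of
# executing it. The source files come from the unpacked WD firmware
# *.deb; sdX is a placeholder for the HDD's assigned letter.
run() { echo "$@"; }   # change `echo "$@"` to `"$@"` to flash for real
run dd if=rootfs.img of=/dev/md0
run dd if=uImage of=/dev/sdX5
run dd if=uImage of=/dev/sdX6
run dd if=k1m0.env of=/dev/sdX7
run dd if=k1m1.env of=/dev/sdX8
run sync
```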
After reattaching the main board to the HDD and connecting ethernet/power, it was apparent the device was already acting differently - normally. Normal HDD noises. The white light stayed on (again, not blinking) like it normally had, without repeating every 42 seconds.
It took about 3-4 minutes, but the blue light came on and everything worked. I could get in via ssh and do a better job of handling bind mountpoints and /etc/fstab; more below if you’re interested.
Side Observations
MyCloud Gen1 uses runlevel 2 (init 2). All the startup scripts can be found under /etc/rc2.d/, with the prefix S* for start and K* for kill (stop). /DataVolume/ isn’t mounted until S15 on mine, though I suspect the order could be different depending on which services may be configured to run.
You can enable (or permanently disable, at least until the next firmware upgrade) any service that appears in /etc/init.d by using the command update-rc.d. For example…
```
$> update-rc.d wdmcserverd disable
$> update-rc.d wdphotodbmergerd disable
```
The old-school /etc/rc.local is another “service” as far as the OS is concerned. This is where I had originally put a call to mount -a to ensure my custom bind mountpoints are reached. However, /etc/rc.local is NOT configured to run for init 2 (nor any runlevel). Furthermore, the contents of /etc/rc.local wouldn’t have survived firmware upgrades.
However, there is an S98user-start -> /CacheVolume/user-start which is configured to run. Like /DataVolume, /CacheVolume survives firmware upgrades. /CacheVolume/user-start is a great place to put any custom startup commands, or anything at all that a person would have otherwise put in /etc/rc.local.
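As a sketch, here is the sort of /CacheVolume/user-start I ended up with: since S98 runs well after /DataVolume is mounted, a simple mount -a there re-applies the fstab bind mounts that failed earlier in boot. The example writes to a temp path so it is safe to run anywhere; on the device you would write to /CacheVolume/user-start instead.

```shell
#!/bin/sh
# Sketch of a /CacheVolume/user-start (run at boot via the S98user-start
# symlink, well after /DataVolume is mounted). Written to a temp path
# here so the sketch itself is harmless to execute.
target="${TMPDIR:-/tmp}/user-start.demo"
cat > "$target" <<'EOF'
#!/bin/sh
# Re-apply the bind mounts from /etc/fstab now that /DataVolume exists.
mount -a
EOF
chmod +x "$target"
echo "wrote $target"
```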
–edits: typos, clarity, and a bit more info